Abstract

We review the formalism and applications of non-linear perturbation theory (PT)
to understanding the large-scale structure of the Universe. We first discuss the
dynamics of gravitational instability, from the linear to the non-linear regime.
This includes Eulerian and Lagrangian PT, non-linear approximations, and a brief
description of numerical simulation techniques. We then cover the basic
statistical tools used in cosmology to describe cosmic fields, such as
correlation functions in real and Fourier space, probability distribution
functions, cumulants and generating functions. In subsequent sections we review
the use of PT to make quantitative predictions about these statistics according
to initial conditions, including effects of possible non-Gaussianity of the
primordial fields. Results are illustrated by detailed comparisons of PT
predictions with numerical simulations. The last sections deal with applications
to observations. First we review in detail practical estimators of statistics in
galaxy catalogs and related errors, including traditional approaches and more
recent developments. Then, we consider the effects of the bias between the
galaxy distribution and the matter distribution, the treatment of redshift
distortions in three-dimensional surveys and of projection effects in angular
catalogs, and some applications to weak gravitational lensing. We finally review
the current observational situation regarding statistics in galaxy catalogs and
what the future generation of galaxy surveys promises to deliver.
1 Introduction and Notation


Understanding the large-scale structure of the Universe is one of the main
goals of cosmology. In the last two decades it has become widely accepted
that gravitational instability plays a central role in giving rise to the
remarkable structures seen in galaxy surveys. Extracting the wealth of
information contained in galaxy clustering to learn about cosmology thus
requires a quantitative understanding of the dynamics of gravitational
instability and application of sophisticated statistical tools that can best be
used to test theoretical models against observations.

In this work we review the use of non-linear cosmological perturbation theory
(hereafter PT) to accomplish this goal. The usefulness of PT in interpreting
results from galaxy surveys is based on the fact that in the gravitational
instability scenario density fluctuations become small enough at large scales
(the so-called "weakly non-linear regime") that a perturbative approach suffices
to understand their evolution. Since early developments in the 1980s, PT has
gone through a period of rapid evolution in the last decade, which gave rise to
numerous useful results. Given the imminent completion of next-generation
large-scale galaxy surveys ideal for applications of PT, it seems timely to
provide a comprehensive review of the subject.

The purpose of this review is twofold: 

1) To summarize the most important theoretical results, which are sometimes 
rather technical and appeared somewhat scattered in the literature with often 
fluctuating notation, in a clear, consistent and unified fashion. We tried in 
particular to unveil approximations that might have been overlooked in the 
original papers, and to highlight the outstanding theoretical issues that remain 
to be addressed. 

2) To present the state-of-the-art observational knowledge of galaxy clustering
with particular emphasis on constraints derived from higher-order statistics on
galaxy biasing and primordial non-Gaussianity, and give a rigorous basis for
the confrontation of theoretical results with observational data from upcoming
galaxy catalogues.

We assume throughout this review that the universe satisfies the standard
homogeneous and isotropic big bang model. The framework of gravitational
instability, on which PT is based, assumes that gravity is the only agent at
large scales responsible for the formation of structures in a universe with
density fluctuations dominated by dark matter. This assumption is in very
good agreement with observations of galaxy clustering, in particular, as we
discuss in detail here, from higher-order statistics which are sensitive to the
detailed structure of the dynamics responsible for large-scale structures^1. The
non-gravitational effects associated with galaxy formation may alter the
distribution of luminous matter compared to that of the underlying dark matter,
in particular at small scales: such 'galaxy biasing' can be probed with the
techniques reviewed in this work.

1 As opposed to just properties of the linearized equations of motion, which can be
mimicked by nongravitational theories of structure formation in some cases [10].

Inevitably, we had to make some decisions in the choice of topics to be covered.
Our presentation is focused on the density field, with much less coverage of
peculiar velocities. This choice is motivated in particular by the comparatively
preliminary stage of cosmic velocity field studies, at least from an
observational point of view (see however [607,160] for a review). On the
other hand, since velocity field results are often obtained by techniques
identical to those used for the density field, we mention some of these results
but without giving them their due importance.

In order to fully characterize the density field, we choose to follow the
traditional approach of using statistical methods, in particular, N-point
correlation functions [508]. Alternative methods include morphological
descriptors such as Minkowski functionals (of which the genus is perhaps the
most widely known), percolation analysis, etc. Unlike correlation functions,
however, these other statistics are not as directly linked to dynamics, and thus
are not as easy to predict from theoretical models. Furthermore, the application
of PT to make predictions of these quantities is still in its infancy
(see e.g. [441] and references therein for recent work).

Given that PT is an approximate method to solve the dynamics of gravitational
clustering, it is desirable to test the validity of the results with other
techniques. In particular, we resort to numerical simulations, which involve
different approximations in solving the equations of motion that are not
restricted to the weakly non-linear regime. There is a strong and healthy
interplay between PT and N-body simulations which we extensively illustrate
throughout this review. At large scales PT can be used to test quantitatively
for spurious effects in numerical simulations (e.g. finite volume effects,
transients from initial conditions), whereas at smaller, non-linear scales
N-body simulations can be used to investigate the regime of validity of PT
predictions. Although reviewing the current understanding of clustering at small
scales is beyond the scope of this review, we have also included a discussion of
the predictions of non-linear clustering amplitudes because connections between
PT and strongly non-linear behavior have been suggested in the literature.
We also include a discussion about stable clustering at small scales which,
when coupled with self-similarity, leads to a connection between the large and
small-scale scaling behavior of correlation functions.

This review is structured so that different chapters can be read independently,
although there are inevitable relations. Chapter 2 deals with the basic
equations of motion and their solution in PT, including a brief summary of
numerical simulations. Chapter 3 is a review of the basics of statistics; we
have made it as succinct as possible to swiftly introduce the reader to the core
of the review. For a more in-depth treatment we refer the reader to [609,61].
The next two chapters represent the main theoretical results; Chapter 4 deals
with N-point functions, whereas Chapter 5 reviews results for the smoothed
one-point moments and PDFs. These two chapters rely heavily on material
covered in Chapters 2 and 3.

In Chapter 6 we describe in detail the standard theory of estimators and errors
for application to galaxy surveys, with particular attention to the issue
of cosmic bias and errors of estimators of the two-point correlation function,
power spectrum, and higher-order moments such as the skewness. Chapter 7
deals with theoretical issues related to surveys, such as redshift distortions,
projection effects, galaxy biasing and weak gravitational lensing. Chapter 8
presents the current observational status of galaxy clustering, including future
prospects in upcoming surveys, with particular emphasis on higher-order
statistics. Chapter 9 contains our conclusions and outlook. A number of
appendices extend the material in the main text for those interested in carrying
out detailed calculations. Finally, to help the reader, Tables 1-4 list the main
abbreviations and notations used for various cosmological variables, fields and
statistics.
Table 1
Abbreviations

PT        Perturbation Theory;
2LPT      Second Order Lagrangian Perturbation Theory;
EPT       Extended Perturbation Theory;
HEPT      HyperExtended Perturbation Theory;
ZA        Zel'dovich Approximation;
SC        Spherical Collapse;
CDM       Cold Dark Matter (model);
SCDM      Standard CDM model;
ΛCDM      Flat CDM model with a cosmological constant;
PDF       Probability Distribution Function;
CPDF      Count Probability Distribution Function.


Table 2
Notation for Various Cosmological Variables

$\Omega_m$                       The total matter density in units of critical density;
$\Omega_\Lambda$                 The reduced cosmological constant;
$\Omega_{\rm tot}$               The total energy density of the universe in units of critical density,
                                 $\Omega_{\rm tot} = \Omega_m + \Omega_\Lambda$;
$H$                              The Hubble constant;
$h$                              The Hubble constant at present time, in units of 100 km/s/Mpc,
                                 $h \equiv H_0/100$;
$a$                              The scale factor;
$\tau$                           The conformal time, $d\tau = dt/a$;
$\mathcal{H}$                    The conformal expansion rate, $\mathcal{H} = aH$;
$D_1$                            The linear growth factor;
$D_n$                            The $n$th order growth factor;
$f(\Omega_m, \Omega_\Lambda)$    The logarithmic derivative of (the fastest growing mode of) the linear
                                 growth factor with respect to $a$: $f \equiv d\ln D_1/d\ln a$.
Table 3
Notation for the Cosmic Fields

$\tilde{X}$                                     The Fourier transform of field $X$;
                                                $\tilde{X}(\mathbf{k}) = (2\pi)^{-3} \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}} X(\mathbf{x})$ (except in Sect. 6.5)
$\mathbf{x}$                                    The comoving position in real space;
$\rho(\mathbf{x})$                              The local cosmic density;
$\delta(\mathbf{x})$                            The local density contrast, $\delta = \rho/\bar\rho - 1$;
$\Phi(\mathbf{x})$                              The gravitational potential;
$\mathbf{u}(\mathbf{x})$                        The local peculiar velocity field;
$\theta(\mathbf{x})$                            The local velocity divergence in units of $\mathcal{H} = aH$;
$F_p(\mathbf{k}_1, \ldots, \mathbf{k}_p)$       The $p$th order density field kernel;
$G_p(\mathbf{k}_1, \ldots, \mathbf{k}_p)$       The $p$th order velocity divergence field kernel;
$\Psi(\mathbf{q})$                              The Lagrangian displacement field;
$J(\mathbf{q})$                                 The Jacobian of the Lagrangian-Eulerian mapping.


Table 4
Notation for Statistical Quantities

$P(k)$                                                   The density power spectrum;
$\Delta(k)$                                              The dimensionless power, $\Delta = 4\pi k^3 P(k)$;
$B(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)$            The bispectrum;
$P_N(\mathbf{k}_1, \ldots, \mathbf{k}_N)$                The N-point polyspectrum;
$P_N$                                                    The count-in-cell probability distribution function;
$p(\delta)\,d\delta$                                     The cosmic density probability distribution function;
$F_k$                                                    The factorial moment of order $k$;
$\xi(\mathbf{x}_1, \mathbf{x}_2) \equiv \xi_2 \equiv \xi$    The two-point correlation function,
                                                         $\xi_2(\mathbf{x}_1, \mathbf{x}_2) = \langle\delta(\mathbf{x}_1)\delta(\mathbf{x}_2)\rangle_c = \langle\delta(\mathbf{x}_1)\delta(\mathbf{x}_2)\rangle$;
$\bar\xi_2 \equiv \bar\xi$                               The cell-averaged two-point correlation function;
$\sigma_8$                                               The value of the (linearly extrapolated) $\sigma$ in a sphere of
                                                         $8\,h^{-1}$ Mpc radius;
$\Gamma$                                                 Shape parameter of the linear power spectrum, $\Gamma \simeq \Omega_m h$;
$\xi_N(\mathbf{x}_1, \ldots, \mathbf{x}_N)$              The N-point correlation functions,
                                                         $\xi_N(\mathbf{x}_1, \ldots, \mathbf{x}_N) = \langle\delta(\mathbf{x}_1) \cdots \delta(\mathbf{x}_N)\rangle_c$;
$w_N(\theta_1, \ldots, \theta_N)$                        The angular N-point correlation functions;
$\bar\xi_N$                                              The cell-averaged N-point correlation functions,
                                                         $\bar\xi_N = \langle\delta_R^N\rangle_c$;
$\bar{w}_N$                                              The cell-averaged angular N-point correlation functions;
$S_p$                                                    The density normalized cumulants,
                                                         $S_p \equiv \langle\delta^p\rangle_c / \langle\delta^2\rangle_c^{p-1} = \bar\xi_p / \bar\xi^{\,p-1}$;
$S_3$, $S_4$                                             The (reduced) skewness/kurtosis;
$s_p$                                                    The projected density normalized cumulants;
$Q \equiv Q_3$, $\tilde{Q} \equiv \tilde{Q}_3$           The three-point hierarchical amplitude in real/Fourier space;
$Q_N$, $\tilde{Q}_N$                                     The N-point hierarchical amplitude in real/Fourier space;
                                                         $Q_N$ can also stand for $S_N/N^{N-2}$ (Chap. 6);
$q_N$, $\tilde{q}_N$                                     The projected N-point hierarchical amplitude in real/Fourier space;
                                                         $q_N$ can also stand for $s_N/N^{N-2}$ (Chap. 6);
$T_p$                                                    The velocity divergence normalized cumulants;
$C_{pq}$                                                 The two-point density normalized cumulants,
                                                         $C_{pq} = \langle\delta_1^p \delta_2^q\rangle_c / (\xi_{12} \langle\delta^2\rangle^{p+q-2})$;
$\varphi(y)$                                             The one-point cumulant generating function,
                                                         $\varphi(y) = \sum_p S_p (-y)^p / p!$;
$\nu_p$, $\mu_p$                                         The density/velocity field vertices;
$\mathcal{G}_\delta(\tau) \equiv G_\delta^L(\tau)$,      The vertex generating function for the density/velocity field,
$\mathcal{G}_\theta(\tau) \equiv G_\theta^L(\tau)$       $\mathcal{G}_\delta(\tau) = \sum_{p \ge 1} \nu_p (-\tau)^p / p!$ and
                                                         $\mathcal{G}_\theta(\tau) \equiv -f(\Omega_m, \Omega_\Lambda) \sum_{p \ge 1} \mu_p (-\tau)^p / p!$;
$\langle X \rangle$                                      The ensemble average of statistic $X$;
$\hat{X}$                                                The estimator of statistic $X$;
$\Upsilon(\hat{X})\,d\hat{X}$                            The cosmic distribution function of estimator $\hat{X}$;
$\Delta\hat{X}$                                          The cosmic error on estimator $\hat{X}$.
2 Dynamics of Gravitational Instability


The most natural explanation for the large-scale structures seen in galaxy
surveys (e.g. superclusters, walls, and filaments) is that they are the result of
gravitational amplification of small primordial fluctuations due to the
gravitational interaction of collisionless cold dark matter (CDM) particles in
an expanding universe [509,75,173,174]. Throughout this review we will assume
this framework and discuss how PT can be used to understand the physics of
gravitational instability and test this hypothesis against observations.

Although the nature of dark matter has not yet been identified, all candidates
for CDM particles are extremely light compared to the mass scale of
typical galaxies, with expected number densities of at least $10^{50}$ particles
per Mpc$^3$ [383]. In this limit where the number of particles $N \gg 1$,
discreteness effects such as two-body relaxation (important e.g. in globular
clusters) are negligible, and collisionless dark matter^2 obeys the Vlasov
equation for the distribution function in phase space, Eq. (12) below. This is
the master equation from which all subsequent calculations of gravitational
instability are derived.

Since CDM particles are non-relativistic, at scales much smaller than the Hubble
radius the equations of motion reduce essentially to those of Newtonian
gravity^3. The expansion of the universe simply calls for a redefinition of the
variables used to describe the position and momentum of particles, and a
redefinition of the gravitational potential. For a detailed discussion of the
Newtonian limit from general relativity see e.g. [508]. We will simply motivate
the results without giving a derivation.


2.1 The Vlasov Equation 


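Let us consider a set of particles of mass $m$ that interact only gravitationally
in an expanding universe. The equation of motion for a particle of velocity
$\mathbf{v}$ at position $\mathbf{r}$ is thus,

$$ \frac{d\mathbf{v}}{dt} = G m \sum_i \frac{\mathbf{r}_i - \mathbf{r}}{|\mathbf{r}_i - \mathbf{r}|^3}, \tag{1} $$

where the summation is made over all other particles at positions $\mathbf{r}_i$.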


2 There has been recently a renewed interest in studying collisional dark
matter [600,700,170], which may help solve some problems with collisionless CDM
at small scales, of order a few kpc.

3 A detailed treatment of relativistic linear perturbation theory of gravitational
instability can be found in [19,466,400].
In the limit of a large number of particles, this equation can be rewritten in
terms of a smooth gravitational potential due to the particle distribution,

$$ \frac{d\mathbf{v}}{dt} = -\frac{\partial\phi}{\partial\mathbf{r}}, \tag{2} $$

where $\phi$ is the Newtonian potential induced by the local mass density
$\rho(\mathbf{r})$,

$$ \phi(\mathbf{r}) = -G \int d^3r'\, \frac{\rho(\mathbf{r}')}{|\mathbf{r}' - \mathbf{r}|}. \tag{3} $$

In the context of gravitational instabilities in an expanding universe we have to
consider the departures from the homogeneous Hubble expansion. Positions of
particles are described by their comoving coordinates $\mathbf{x}$ such that the
physical coordinates are $\mathbf{r} = a(\tau)\,\mathbf{x}$, where $a$ is the
cosmological scale factor. We choose to describe the equations of motion in
terms of the conformal time $\tau$, related to cosmic time by
$dt = a(\tau)\,d\tau$. The equations of motion that follow are valid in
an arbitrary homogeneous and isotropic background Universe, which evolves
according to the Friedmann equations:

$$ \frac{\partial\mathcal{H}(\tau)}{\partial\tau} = -\frac{\Omega_m(\tau)}{2}\,\mathcal{H}^2(\tau) + \frac{\Lambda}{3}\,a^2(\tau) \equiv \left(\Omega_\Lambda(\tau) - \frac{\Omega_m(\tau)}{2}\right)\mathcal{H}^2(\tau), \tag{4} $$

$$ \left(\Omega_{\rm tot}(\tau) - 1\right)\mathcal{H}^2(\tau) = k, \tag{5} $$

where $\mathcal{H} \equiv d\ln a/d\tau = Ha$ is the conformal expansion rate, $H$
is the Hubble constant, $\Omega_m$ is the ratio of matter density to critical
density, $\Lambda$ is the cosmological constant and $k = -1, 0, 1$ for
$\Omega_{\rm tot} < 1$, $\Omega_{\rm tot} = 1$ and $\Omega_{\rm tot} > 1$
respectively ($\Omega_{\rm tot} \equiv \Omega_m + \Omega_\Lambda$). Note that
$\Omega_m$ and $\Omega_\Lambda$ are time dependent.
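The time dependence of the density parameters, and the first Friedmann equation, Eq. (4), can be checked numerically. The sketch below is an illustration rather than part of the original text: it assumes units in which the present-day Hubble constant is one and $a = 1$ today, and the function names and the present-day parameters `om0` and `ol0` are hypothetical choices made for this example.

```python
import math

def hubble(a, om0, ol0):
    """H(a)/H0 for matter, curvature and a cosmological constant (a = 1 today).
    om0 and ol0 are assumed present-day density parameters."""
    return math.sqrt(om0 / a**3 + (1.0 - om0 - ol0) / a**2 + ol0)

def omega_m(a, om0, ol0):
    """Matter density parameter at scale factor a."""
    return om0 / a**3 / hubble(a, om0, ol0)**2

def omega_l(a, om0, ol0):
    """Reduced cosmological constant at scale factor a."""
    return ol0 / hubble(a, om0, ol0)**2

def conformal_rate(a, om0, ol0):
    """Conformal expansion rate aH, in units of H0."""
    return a * hubble(a, om0, ol0)

def friedmann_residual(a, om0, ol0, step=1e-5):
    """Difference between the two sides of Eq. (4), in units of (aH)^2.
    The conformal-time derivative is taken as d/dtau = a * (aH) * d/da,
    which follows from dtau = dt/a and dt = da/(aH)."""
    hc = conformal_rate(a, om0, ol0)
    dhc_da = (conformal_rate(a + step, om0, ol0)
              - conformal_rate(a - step, om0, ol0)) / (2.0 * step)
    lhs = a * hc * dhc_da
    rhs = (omega_l(a, om0, ol0) - 0.5 * omega_m(a, om0, ol0)) * hc**2
    return (lhs - rhs) / hc**2
```

For any choice of `om0` and `ol0`, `omega_m` tends to one at early times (small `a`), which is why the Einstein-de Sitter case is a useful reference for the early stages of gravitational instability.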

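We then define the density contrast $\delta(\mathbf{x}, \tau)$ by,

$$ \rho(\mathbf{x}, \tau) \equiv \bar\rho(\tau)\,[1 + \delta(\mathbf{x}, \tau)], \tag{6} $$

the peculiar velocity $\mathbf{u}$ with

$$ \mathbf{v}(\mathbf{x}, \tau) \equiv \mathcal{H}\,\mathbf{x} + \mathbf{u}(\mathbf{x}, \tau), \tag{7} $$

and the cosmological gravitational potential $\Phi$ with

$$ \phi(\mathbf{x}, \tau) \equiv -\frac{1}{2}\frac{\partial\mathcal{H}}{\partial\tau}\,x^2 + \Phi(\mathbf{x}, \tau), \tag{8} $$

so that the latter is sourced only by density fluctuations, as expected; indeed
the Poisson equation reads,

$$ \nabla^2\Phi(\mathbf{x}, \tau) = \frac{3}{2}\,\Omega_m(\tau)\,\mathcal{H}^2(\tau)\,\delta(\mathbf{x}, \tau). \tag{9} $$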
In the following we will only use comoving coordinates as the spatial variable
so that all space derivatives should be understood as done with respect to
$\mathbf{x}$. The equation of motion Eq. (2) then reads

$$ \frac{d\mathbf{p}}{d\tau} = -a m \nabla\Phi(\mathbf{x}), \tag{10} $$

with

$$ \mathbf{p} = a m \mathbf{u}. \tag{11} $$

Let us now define the particle number density in phase space by
$f(\mathbf{x}, \mathbf{p}, \tau)$; phase-space conservation implies the Vlasov
equation,

$$ \frac{df}{d\tau} = \frac{\partial f}{\partial\tau} + \frac{\mathbf{p}}{ma}\cdot\nabla f - a m \nabla\Phi\cdot\frac{\partial f}{\partial\mathbf{p}} = 0. \tag{12} $$

Needless to say, this equation is very difficult to solve, being a non-linear
partial differential equation involving seven variables. The non-linearity is
induced by the fact that the potential $\Phi$ depends through the Poisson
equation on the integral of the distribution function over momentum (which gives
the density field, see Eq. (13) below).


2.2 Eulerian Dynamics 


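In practice, however, we are usually not interested in solving the full
phase-space dynamics, but rather the evolution of the spatial distribution. This
can be conveniently obtained by taking momentum moments of the distribution
function. The zeroth order moment simply relates the phase space density to
the local mass density field,

$$ \int d^3p\, f(\mathbf{x}, \mathbf{p}, \tau) \equiv \rho(\mathbf{x}, \tau). \tag{13} $$

The next order moments,

$$ \int d^3p\, \frac{\mathbf{p}}{am}\, f(\mathbf{x}, \mathbf{p}, \tau) \equiv \rho(\mathbf{x}, \tau)\,\mathbf{u}(\mathbf{x}, \tau), \tag{14} $$

$$ \int d^3p\, \frac{p_i p_j}{a^2 m^2}\, f(\mathbf{x}, \mathbf{p}, \tau) \equiv \rho(\mathbf{x}, \tau)\,u_i(\mathbf{x}, \tau)\,u_j(\mathbf{x}, \tau) + \sigma_{ij}(\mathbf{x}, \tau), \tag{15} $$

define the peculiar velocity flow $\mathbf{u}(\mathbf{x}, \tau)$ and the stress
tensor $\sigma_{ij}(\mathbf{x}, \tau)$. The equations for these fields follow
from taking moments of the Vlasov equation. The zeroth moment gives the
continuity equation,

$$ \frac{\partial\delta(\mathbf{x}, \tau)}{\partial\tau} + \nabla\cdot\left\{[1 + \delta(\mathbf{x}, \tau)]\,\mathbf{u}(\mathbf{x}, \tau)\right\} = 0, \tag{16} $$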
which describes conservation of mass. Taking the first moment of Eq. (12)
and subtracting $\mathbf{u}(\mathbf{x}, \tau)$ times the continuity equation we
obtain the Euler equation,

$$ \frac{\partial\mathbf{u}(\mathbf{x}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\mathbf{u}(\mathbf{x}, \tau) + \mathbf{u}(\mathbf{x}, \tau)\cdot\nabla\mathbf{u}(\mathbf{x}, \tau) = -\nabla\Phi(\mathbf{x}, \tau) - \frac{1}{\rho}\nabla_j\sigma_{ij}, \tag{17} $$

which describes conservation of momentum. Note that the continuity equation
couples the zeroth ($\rho$) to the first moment ($\mathbf{u}$) of the
distribution function, the Euler equation couples the first moment
($\mathbf{u}$) to the second moment ($\sigma_{ij}$), and so on. However, having
integrated out the phase-space information, we are here on more familiar ground,
and we have reasonable phenomenological models to close the hierarchy by
postulating an ansatz for the stress tensor $\sigma_{ij}$, i.e. the equation of
state of the cosmological fluid. For example, standard fluid dynamics [392] gives
$\sigma_{ij} = -p\,\delta_{ij} + \eta\,(\nabla_i u_j + \nabla_j u_i - \frac{2}{3}\delta_{ij}\nabla\cdot\mathbf{u}) + \zeta\,\delta_{ij}\nabla\cdot\mathbf{u}$,
where $p$ denotes the pressure and $\eta$ and $\zeta$ are viscosity coefficients.

The equation of state basically relies on the assumption that cosmological
structure formation is driven by matter with negligible velocity dispersion or
pressure, as for example cold dark matter (CDM). Note that from its definition,
Eq. (15), the stress tensor characterizes the deviation of particle motions
from a single coherent flow (single stream), for which the first term will be the
dominant contribution. Therefore, it is a good approximation to set
$\sigma_{ij} \approx 0$, at least in the first stages of gravitational
instability when structures did not have time to collapse and virialize. As time
goes on, this approximation will break down at progressively larger scales, but
we will see that at present times at the scales relevant to large-scale
structure, a great deal can be explored and understood using this simple
approximation. In particular, the breakdown of $\sigma_{ij} \approx 0$ describes
the generation of velocity dispersion (or even anisotropic pressure) due to
multiple streams, generically known as shell crossing. We will discuss this
issue further below.

We now turn to a systematic investigation of the solutions of Eqs. (9,16,17) 
for vanishing stress tensor. 


2.3 Eulerian Linear Perturbation Theory 


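At large scales, where we expect the Universe to become smooth, the fluctuation
fields in Eqs. (6-8) can be assumed to be small compared to the homogeneous
contribution described by the first terms. Therefore, it follows that
we can linearize Eqs. (9,16,17) to obtain the equations of motion in the linear
regime

$$ \frac{\partial\delta(\mathbf{x}, \tau)}{\partial\tau} + \theta(\mathbf{x}, \tau) = 0, \tag{18} $$

$$ \frac{\partial\mathbf{u}(\mathbf{x}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\mathbf{u}(\mathbf{x}, \tau) = -\nabla\Phi(\mathbf{x}, \tau), \tag{19} $$

where $\theta(\mathbf{x}, \tau) \equiv \nabla\cdot\mathbf{u}(\mathbf{x}, \tau)$
is the divergence of the velocity field. These equations are now straightforward
to solve. The velocity field, as any vector field, can be completely described
by its divergence $\theta(\mathbf{x}, \tau)$ and its vorticity
$\mathbf{w}(\mathbf{x}, \tau) \equiv \nabla\times\mathbf{u}(\mathbf{x}, \tau)$,
whose equations of motion follow from Eq. (19) together with the Poisson
equation, Eq. (9),

$$ \frac{\partial\theta(\mathbf{x}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\theta(\mathbf{x}, \tau) + \frac{3}{2}\,\Omega_m(\tau)\,\mathcal{H}^2(\tau)\,\delta(\mathbf{x}, \tau) = 0, \tag{20} $$

$$ \frac{\partial\mathbf{w}(\mathbf{x}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\mathbf{w}(\mathbf{x}, \tau) = 0. \tag{21} $$

The vorticity evolution readily follows from Eq. (21),
$\mathbf{w}(\tau) \propto a^{-1}$, i.e. in the linear regime any initial
vorticity decays away due to the expansion of the Universe. The density contrast
evolution follows by taking the time derivative of Eq. (18) and using Eq. (20)
to eliminate the velocity divergence,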

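$$ \frac{d^2 D_1(\tau)}{d\tau^2} + \mathcal{H}(\tau)\,\frac{dD_1(\tau)}{d\tau} = \frac{3}{2}\,\Omega_m(\tau)\,\mathcal{H}^2(\tau)\,D_1(\tau), \tag{22} $$

where we wrote $\delta(\mathbf{x}, \tau) = D_1(\tau)\,\delta(\mathbf{x}, 0)$,
with $D_1(\tau)$ the linear growth factor. This equation, together with the
Friedmann equations, Eqs. (4-5), determines the growth of density perturbations
in the linear regime as a function of cosmology. Since it is a second-order
differential equation, it has two independent solutions; let us denote the
fastest growing mode $D_1^{(+)}(\tau)$ and the slowest one $D_1^{(-)}(\tau)$.
The evolution of the density is then

$$ \delta(\mathbf{x}, \tau) = D_1^{(+)}(\tau)\,A(\mathbf{x}) + D_1^{(-)}(\tau)\,B(\mathbf{x}), \tag{23} $$

where $A(\mathbf{x})$ and $B(\mathbf{x})$ are two arbitrary functions of position
describing the initial density field configuration, whereas the velocity
divergence [using Eq. (18)] is given by

$$ \theta(\mathbf{x}, \tau) = -\mathcal{H}(\tau)\left[f(\Omega_m, \Omega_\Lambda)\,D_1^{(+)}(\tau)\,A(\mathbf{x}) + g(\Omega_m, \Omega_\Lambda)\,D_1^{(-)}(\tau)\,B(\mathbf{x})\right], \tag{24} $$

$$ f(\Omega_m, \Omega_\Lambda) \equiv \frac{d\ln D_1^{(+)}}{d\ln a} = \frac{1}{\mathcal{H}}\frac{d\ln D_1^{(+)}}{d\tau}, \qquad g(\Omega_m, \Omega_\Lambda) = \frac{1}{\mathcal{H}}\frac{d\ln D_1^{(-)}}{d\tau}. \tag{25} $$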
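As a quick consistency check of Eq. (22), the Einstein-de Sitter case can be worked out by hand. The power-law ansatz used here is an assumption made for illustration. For $\Omega_m = 1$ and $\Omega_\Lambda = 0$, Eq. (4) reduces to $\partial\mathcal{H}/\partial\tau = -\mathcal{H}^2/2$, which is solved by $\mathcal{H} = 2/\tau$, so that $a \propto \tau^2$. Inserting the trial solution $D_1 \propto \tau^p$ into Eq. (22) gives

$$ p(p-1) + 2p = \frac{3}{2}\cdot 4 \quad\Longrightarrow\quad p^2 + p - 6 = 0 \quad\Longrightarrow\quad p = 2 \ \ {\rm or}\ \ p = -3. $$

The two modes are therefore $D_1^{(+)} \propto \tau^2 \propto a$ and $D_1^{(-)} \propto \tau^{-3} \propto a^{-3/2}$, and Eq. (25) gives $f = 1$ and $g = -3/2$ in this case.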

The most important cases are

(1) When $\Omega_m = 1$, $\Omega_\Lambda = 0$, we have the simple solution

$$ D_1^{(+)} = a, \qquad D_1^{(-)} = a^{-3/2}, \qquad f(1, 0) = 1, \tag{26} $$

    thus density fluctuations grow as the scale factor.

(2) When $\Omega_m < 1$, $\Omega_\Lambda = 0$ we have
    ($x \equiv 1/\Omega_m - 1$) [504]

$$ D_1^{(+)} = 1 + \frac{3}{x} + 3\sqrt{\frac{1+x}{x^3}}\,\ln\left(\sqrt{1+x} - \sqrt{x}\right), \qquad D_1^{(-)} = \sqrt{\frac{1+x}{x^3}}, \tag{27} $$

    and the logarithmic derivative can be approximated by [506]

$$ f(\Omega_m, 0) \approx \Omega_m^{3/5}. \tag{28} $$

    As $\Omega_m \to 0$ ($x \gg 1$), $D_1^{(+)} \to 1$ and
    $D_1^{(-)} \to x^{-1}$, and perturbations cease to grow.

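(3) In the case where there is only matter and vacuum energy, the linear
    growth factor admits the integral representation [305] as a function of
    $\Omega_m$ and $\Omega_\Lambda$

$$ D_1^{(+)} = \frac{5\,\Omega_m}{2}\,H(a)\int_0^a \frac{da'}{a'^3 H^3(a')}, \tag{29} $$

    where $H(a) = \sqrt{\Omega_m a^{-3} + (1 - \Omega_m - \Omega_\Lambda)\,a^{-2} + \Omega_\Lambda}$,
    with the density parameters in Eq. (29) evaluated at $a = 1$. In general,
    it is not possible to solve analytically for $D_1^{(+)}$ (unlike
    $D_1^{(-)}$, see [305]), but it can be approximated by [390,114]

$$ D_1^{(+)} \approx \frac{5}{2}\,\frac{a\,\Omega_m}{\Omega_m^{4/7} - \Omega_\Lambda + (1 + \Omega_m/2)(1 + \Omega_\Lambda/70)}, \tag{30} $$

$$ D_1^{(-)} = \frac{\mathcal{H}}{a}, \tag{31} $$

$$ f(\Omega_m, \Omega_\Lambda) \approx \frac{1}{\left[1 - (\Omega_m^0 + \Omega_\Lambda^0 - 1)\,a + \Omega_\Lambda^0\,a^3\right]^{0.6}}, \tag{32} $$

    where $\Omega_m^0 \equiv \Omega_m(a = 1)$ and
    $\Omega_\Lambda^0 \equiv \Omega_\Lambda(a = 1)$. When
    $\Omega_m + \Omega_\Lambda = 1$, we have

$$ f(\Omega_m, 1 - \Omega_m) \approx \Omega_m^{5/9}. \tag{33} $$

Due to Eq. (31) and Eq. (4), $g(\Omega_m, \Omega_\Lambda) = \Omega_\Lambda - \Omega_m/2 - 1$
holds for arbitrary $\Omega_m$ and $\Omega_\Lambda$.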
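The three cases above can be compared numerically. The following is a minimal sketch, not taken from the original text. It assumes that the growing mode is normalized so that $D_1^{(+)} \to a$ at early times, that $a = 1$ today, and that `om0` and `ol0` denote present-day density parameters. All function names are hypothetical, and the integral is evaluated with a simple composite Simpson rule.

```python
import math

def hubble(a, om0, ol0):
    """H(a)/H0 for matter, curvature and a cosmological constant."""
    return math.sqrt(om0 / a**3 + (1.0 - om0 - ol0) / a**2 + ol0)

def growth_integral(a, om0, ol0, n=2000):
    """Composite Simpson estimate of the integral of 1/(a' H(a'))^3 from 0 to a."""
    def integrand(x):
        return 0.0 if x == 0.0 else 1.0 / (x * hubble(x, om0, ol0))**3
    h = a / n
    total = integrand(0.0) + integrand(a)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def growth_integral_form(a, om0, ol0):
    """Growing mode from the integral representation, normalized to a at early times."""
    return 2.5 * om0 * hubble(a, om0, ol0) * growth_integral(a, om0, ol0)

def growth_rate(a, om0, ol0):
    """f = dlnD/dlna, obtained by differentiating the integral representation."""
    hub = hubble(a, om0, ol0)
    dlnh_dlna = -(3.0 * om0 / a**3 + 2.0 * (1.0 - om0 - ol0) / a**2) / (2.0 * hub**2)
    return dlnh_dlna + a / ((a * hub)**3 * growth_integral(a, om0, ol0))

def growth_fit_today(om, ol):
    """Fitting formula of Eq. (30) evaluated at a = 1."""
    return 2.5 * om / (om**(4.0 / 7.0) - ol + (1.0 + om / 2.0) * (1.0 + ol / 70.0))

def growth_open_closed_form(x):
    """Closed-form growing mode of Eq. (27); it tends to 2x/5 for small x."""
    return (1.0 + 3.0 / x
            + 3.0 * math.sqrt((1.0 + x) / x**3)
            * math.log(math.sqrt(1.0 + x) - math.sqrt(x)))
```

Because Eq. (27) behaves as $2x/5$ at early times while the integral form is normalized to $a$, the two differ by the constant factor $(5/2)\,\Omega_m^0/(1 - \Omega_m^0)$ when compared at the same epoch.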


2.4 Eulerian Non-Linear Perturbation Theory

We will now consider the evolution of density and velocity fields beyond the
linear approximation. To do so, we shall first make a self-consistent
approximation, that is, we will characterize the velocity field by its
divergence, and neglect the vorticity degrees of freedom. This can be justified
as follows. From Eq. (17) we can write the vorticity equation of motion

$$ \frac{\partial\mathbf{w}(\mathbf{x}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\mathbf{w}(\mathbf{x}, \tau) - \nabla\times[\mathbf{u}(\mathbf{x}, \tau)\times\mathbf{w}(\mathbf{x}, \tau)] = -\nabla\times\left(\frac{1}{\rho}\nabla\cdot\vec{\sigma}\right), \tag{34} $$

where we have temporarily restored the stress tensor contribution
($\sigma_{ij}$) to the conservation of momentum. We see that if
$\sigma_{ij} \approx 0$, as in the case of a pressureless perfect fluid, and the
primordial vorticity vanishes, it remains zero at all times. On the other hand,
if the initial vorticity is non-zero, we saw in the previous section that in the
linear regime vorticity decays due to the expansion of the Universe; however, it
can be amplified non-linearly through the third term in Eq. (34). In what
follows, we shall assume that the initial vorticity vanishes, thus Eq. (34)
together with the equation of state $\sigma_{ij} \approx 0$ guarantees
that vorticity remains zero throughout the evolution. We must note, however,
that this assumption is self-consistent only as long as the condition
$\sigma_{ij} \approx 0$ remains valid; in particular, multi-streaming and shocks
can generate vorticity (see for instance [521]). This is indeed expected to
happen at small enough scales. We will come back to this point in order to
interpret the breakdown of perturbation theory at small scales.

The assumption of perturbation theory is that it is possible to expand the
density and velocity fields about the linear solutions, effectively treating the
variance of the linear fluctuations as a small parameter (and assuming no
vorticity in the velocity field). Linear solutions correspond to simple (time
dependent) scalings of the initial density field; thus we can write

$$ \delta(\mathbf{x}, \tau) = \sum_{n=1}^{\infty} \delta^{(n)}(\mathbf{x}, \tau), \qquad \theta(\mathbf{x}, \tau) = \sum_{n=1}^{\infty} \theta^{(n)}(\mathbf{x}, \tau), \tag{35} $$

where $\delta^{(1)}$ and $\theta^{(1)}$ are linear in the initial density field,
$\delta^{(2)}$ and $\theta^{(2)}$ are quadratic in the initial density field, etc.

2.4.1 The Equations of Motion in the Fourier Representation

At large scales, when fluctuations are small, linear perturbation theory
provides an adequate description of cosmological fields. In this regime,
different Fourier modes evolve independently, conserving the primordial
statistics. Therefore, it is natural to Fourier transform Eqs. (9,16,17) and
work in Fourier space. Our convention for the Fourier transform of a field
$A(\mathbf{x}, \tau)$ is:

$$ \tilde{A}(\mathbf{k}, \tau) = \int \frac{d^3x}{(2\pi)^3}\,\exp(-i\mathbf{k}\cdot\mathbf{x})\,A(\mathbf{x}, \tau). \tag{36} $$

When non-linear terms in the perturbation series are taken into account,
the equations of motion in Fourier space show the coupling between different
Fourier modes characteristic of non-linear theories. Taking the divergence
of Eq. (17) and Fourier transforming the resulting equations of motion
we get:

$$ \frac{\partial\tilde\delta(\mathbf{k}, \tau)}{\partial\tau} + \tilde\theta(\mathbf{k}, \tau) = -\int d^3k_1\, d^3k_2\, \delta_D(\mathbf{k} - \mathbf{k}_{12})\,\alpha(\mathbf{k}_1, \mathbf{k}_2)\,\tilde\theta(\mathbf{k}_1, \tau)\,\tilde\delta(\mathbf{k}_2, \tau), \tag{37} $$

$$ \frac{\partial\tilde\theta(\mathbf{k}, \tau)}{\partial\tau} + \mathcal{H}(\tau)\,\tilde\theta(\mathbf{k}, \tau) + \frac{3}{2}\,\Omega_m\,\mathcal{H}^2(\tau)\,\tilde\delta(\mathbf{k}, \tau) = -\int d^3k_1\, d^3k_2\, \delta_D(\mathbf{k} - \mathbf{k}_{12})\,\beta(\mathbf{k}_1, \mathbf{k}_2)\,\tilde\theta(\mathbf{k}_1, \tau)\,\tilde\theta(\mathbf{k}_2, \tau), \tag{38} $$

($\delta_D$ denotes the three-dimensional Dirac delta distribution and
$\mathbf{k}_{12} \equiv \mathbf{k}_1 + \mathbf{k}_2$) where the functions

$$ \alpha(\mathbf{k}_1, \mathbf{k}_2) \equiv \frac{\mathbf{k}_{12}\cdot\mathbf{k}_1}{k_1^2}, \qquad \beta(\mathbf{k}_1, \mathbf{k}_2) \equiv \frac{k_{12}^2\,(\mathbf{k}_1\cdot\mathbf{k}_2)}{2 k_1^2 k_2^2}, \tag{39} $$

encode the non-linearity of the evolution (mode coupling) and come from the
non-linear terms in the continuity equation (16) and the Euler equation (17)
respectively. From Eqs. (37)-(38) we see that the evolution of
$\tilde\delta(\mathbf{k}, \tau)$ and $\tilde\theta(\mathbf{k}, \tau)$ is
determined by the mode coupling of the fields at all pairs of wave-vectors
$\mathbf{k}_1$ and $\mathbf{k}_2$ whose sum is $\mathbf{k}$, as required by
translation invariance in a spatially homogeneous Universe.


2.4.2 General Solutions in Einstein-de Sitter Cosmology

Let us first consider an Einstein-de Sitter Universe, for which $\Omega_m = 1$
and $\Omega_\Lambda = 0$. In this case the Friedmann equation, Eq. (4), implies
$a(\tau) \propto \tau^2$, $\mathcal{H}(\tau) = 2/\tau$, and scaling out an
overall factor of $\mathcal{H}$ from the velocity field brings Eqs. (37-38) into
homogeneous form in $\tau$ or, equivalently, in $a(\tau)$. As a consequence,
these equations can formally be solved with the following perturbative expansion
[270,334,428],

$$ \tilde\delta(\mathbf{k}, \tau) = \sum_{n=1}^{\infty} a^n(\tau)\,\delta_n(\mathbf{k}), \qquad \tilde\theta(\mathbf{k}, \tau) = -\mathcal{H}(\tau)\sum_{n=1}^{\infty} a^n(\tau)\,\theta_n(\mathbf{k}), \tag{40} $$

where only the fastest growing mode is taken into account. Remarkably it
implies that the PT expansions defined in Eq. (35) are actually expansions
with respect to the linear density field with time independent coefficients.
At small $a$ the series are dominated by their first term, and since
$\theta_1(\mathbf{k}) = \delta_1(\mathbf{k})$ from the continuity equation,
$\delta_1(\mathbf{k})$ completely characterizes the linear fluctuations.

The equations of motion, Eqs. (37-38), determine $\delta_n(\mathbf{k})$ and
$\theta_n(\mathbf{k})$ in terms of the linear fluctuations to be:

$$ \delta_n(\mathbf{k}) = \int d^3q_1 \ldots \int d^3q_n\, \delta_D(\mathbf{k} - \mathbf{q}_{1 \ldots n})\, F_n(\mathbf{q}_1, \ldots, \mathbf{q}_n)\, \delta_1(\mathbf{q}_1) \ldots \delta_1(\mathbf{q}_n), \tag{41} $$

$$ \theta_n(\mathbf{k}) = \int d^3q_1 \ldots \int d^3q_n\, \delta_D(\mathbf{k} - \mathbf{q}_{1 \ldots n})\, G_n(\mathbf{q}_1, \ldots, \mathbf{q}_n)\, \delta_1(\mathbf{q}_1) \ldots \delta_1(\mathbf{q}_n), \tag{42} $$

where $F_n$ and $G_n$ are homogeneous functions of the wave vectors
$\{\mathbf{q}_1, \ldots, \mathbf{q}_n\}$ with degree zero. They are constructed
from the fundamental mode coupling functions
$\alpha(\mathbf{k}_1, \mathbf{k}_2)$ and $\beta(\mathbf{k}_1, \mathbf{k}_2)$
according to the recursion relations ($n \ge 2$, see [270,334] for a derivation):


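$$ F_n(\mathbf{q}_1, \ldots, \mathbf{q}_n) = \sum_{m=1}^{n-1} \frac{G_m(\mathbf{q}_1, \ldots, \mathbf{q}_m)}{(2n+3)(n-1)} \Big[ (2n+1)\,\alpha(\mathbf{k}_1, \mathbf{k}_2)\, F_{n-m}(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) + 2\,\beta(\mathbf{k}_1, \mathbf{k}_2)\, G_{n-m}(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) \Big], \tag{43} $$

$$ G_n(\mathbf{q}_1, \ldots, \mathbf{q}_n) = \sum_{m=1}^{n-1} \frac{G_m(\mathbf{q}_1, \ldots, \mathbf{q}_m)}{(2n+3)(n-1)} \Big[ 3\,\alpha(\mathbf{k}_1, \mathbf{k}_2)\, F_{n-m}(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) + 2n\,\beta(\mathbf{k}_1, \mathbf{k}_2)\, G_{n-m}(\mathbf{q}_{m+1}, \ldots, \mathbf{q}_n) \Big], \tag{44} $$

(where $\mathbf{k}_1 \equiv \mathbf{q}_1 + \ldots + \mathbf{q}_m$,
$\mathbf{k}_2 \equiv \mathbf{q}_{m+1} + \ldots + \mathbf{q}_n$,
$\mathbf{k} \equiv \mathbf{k}_1 + \mathbf{k}_2$, and $F_1 = G_1 \equiv 1$).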
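For $n = 2$ we have, after symmetrizing over the two arguments:

$$ F_2(\mathbf{q}_1, \mathbf{q}_2) = \frac{5}{7} + \frac{1}{2}\,\frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1 q_2}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right) + \frac{2}{7}\,\frac{(\mathbf{q}_1\cdot\mathbf{q}_2)^2}{q_1^2 q_2^2}, \tag{45} $$

$$ G_2(\mathbf{q}_1, \mathbf{q}_2) = \frac{3}{7} + \frac{1}{2}\,\frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1 q_2}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right) + \frac{4}{7}\,\frac{(\mathbf{q}_1\cdot\mathbf{q}_2)^2}{q_1^2 q_2^2}. \tag{46} $$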

Explicit expressions for the kernels $F_3$ and $F_4$ are given in [270]. Note
that the symmetrized kernels, $F_n^{(s)}$ (obtained by a summation of $F_n$ with
all possible permutations of the variables), have the following properties
[270,692]:

(1) As $\mathbf{k} = \mathbf{q}_1 + \ldots + \mathbf{q}_n$ goes to zero, but the
    individual $\mathbf{q}_i$ do not, $F_n^{(s)} \propto k^2$. This is a
    consequence of momentum conservation in center of mass coordinates.

(2) As some of the arguments of $F_n^{(s)}$ get large but the total sum
    $\mathbf{k} = \mathbf{q}_1 + \ldots + \mathbf{q}_n$ stays fixed, the kernels
    vanish in inverse square law. That is, for $p \gg q_i$ we have:

$$ F_n^{(s)}(\mathbf{q}_1, \ldots, \mathbf{q}_{n-2}, \mathbf{p}, -\mathbf{p}) \propto k^2/p^2, \tag{47} $$

    and similarly for $G_n^{(s)}$.

(3) If one of the arguments $\mathbf{q}_i$ of $F_n^{(s)}$ or $G_n^{(s)}$ goes to
    zero, there is an infrared divergence of the form $\mathbf{q}_i/q_i^2$. This
    comes from the infrared behavior of the mode coupling functions
    $\alpha(\mathbf{k}_1, \mathbf{k}_2)$ and $\beta(\mathbf{k}_1, \mathbf{k}_2)$.
    There are no infrared divergences as partial sums of several wavevectors go
    to zero.
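Property (1) can be probed numerically with a direct, unoptimized implementation of the recursion relations. This is a sketch under stated assumptions, not the method of the cited references: the function names, the test wave vectors and the direction `khat` are arbitrary illustrative choices. The kernels are symmetrized by brute-force averaging over all permutations, which is only practical for small $n$.

```python
import itertools

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vsum(vectors):
    """Component-wise sum of a sequence of 3-vectors."""
    return tuple(sum(c) for c in zip(*vectors))

def alpha(k1, k2):
    return dot(vsum([k1, k2]), k1) / dot(k1, k1)

def beta(k1, k2):
    k12 = vsum([k1, k2])
    return dot(k12, k12) * dot(k1, k2) / (2.0 * dot(k1, k1) * dot(k2, k2))

def kernels(qs):
    """Unsymmetrized (F_n, G_n) from Eqs. (43)-(44); qs is a tuple of 3-vectors."""
    n = len(qs)
    if n == 1:
        return 1.0, 1.0
    f_n = g_n = 0.0
    for m in range(1, n):
        k1, k2 = vsum(qs[:m]), vsum(qs[m:])
        g_m = kernels(qs[:m])[1]
        f_rest, g_rest = kernels(qs[m:])
        a, b = alpha(k1, k2), beta(k1, k2)
        f_n += g_m * ((2 * n + 1) * a * f_rest + 2 * b * g_rest)
        g_n += g_m * (3 * a * f_rest + 2 * n * b * g_rest)
    norm = (2 * n + 3) * (n - 1)
    return f_n / norm, g_n / norm

def symmetrized_f(qs):
    """F_n averaged over all permutations of its arguments."""
    perms = list(itertools.permutations(qs))
    return sum(kernels(p)[0] for p in perms) / len(perms)

def squeezed_ratio(n, eps=1e-3):
    """Ratio of symmetrized F_n at total wave vector 2*eps*khat to that at eps*khat.
    A value close to 4 indicates the k^2 scaling of property (1)."""
    base = [(1.0, 0.2, -0.3), (-0.4, 0.9, 0.5)][:n - 1]
    khat = (0.3, -0.5, 0.8)
    offset = vsum(base)
    def f_at(scale):
        last = tuple(scale * c - s for c, s in zip(khat, offset))
        return symmetrized_f(tuple(base) + (last,))
    return f_at(2.0 * eps) / f_at(eps)
```

Calling `squeezed_ratio(2)` and `squeezed_ratio(3)` compares the kernel at two nearby small values of the total wave vector. The individual arguments and their partial sums stay of order unity in this configuration, so no infrared divergence of the type described in property (3) is encountered.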

A simple application of the recursion relations is to derive the corresponding
recursion relation for vertices $\nu_n$ and $\mu_n$ which correspond to the
spherical average of the PT kernels:

$$ \nu_n \equiv n! \int \frac{d\Omega_1}{4\pi} \ldots \frac{d\Omega_n}{4\pi}\, F_n(\mathbf{k}_1, \ldots, \mathbf{k}_n), \tag{48} $$

$$ \mu_n \equiv n! \int \frac{d\Omega_1}{4\pi} \ldots \frac{d\Omega_n}{4\pi}\, G_n(\mathbf{k}_1, \ldots, \mathbf{k}_n). \tag{49} $$

Since the kernels $F_n$ and $G_n$ depend only on the ratios $k_i/k_j$, the
vertices depend a priori on these quantities as well. Considering
Eqs. (43, 44), one can see that the angle integrations can be done recursively:
it is possible to integrate first on the angle between the vectors
$\mathbf{k}_1 = \mathbf{q}_1 + \ldots + \mathbf{q}_m$ and
$\mathbf{k}_2 = \mathbf{q}_{m+1} + \ldots + \mathbf{q}_n$, which amounts to
replacing $\alpha(\mathbf{k}_1, \mathbf{k}_2)$ and
$\beta(\mathbf{k}_1, \mathbf{k}_2)$ by their angular averages $\bar\alpha = 1$
and $\bar\beta = 1/3$. As a result we have,


n— 1 


= E 


n 


l^r, 


\ m ) (^ n + 3)(n — 1) L 


(2'U 4“ 1 )Vn—m 4“ g/^ra —m 


n—1 


k'n — 


n 


hr: 


3 l^n—m 4“ gU-hn—r 


^ \mj (2 n + 3)(n - 1) L 
and the vertices are thus pure numbers, e.g.: 


34 682 26 142 

_ 1 ; „ 2 _ - ; „ 3 _ — ; ,, 2 _ ; ,<3 _ — 


(50) 

(51) 


(52) 
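The recursion in Eqs. (50) and (51) is easy to iterate with exact rational arithmetic. The sketch below is only an illustration (the function name `vertices` is ours); it starts from $\nu_1 = \mu_1 = 1$ and reproduces the numbers quoted in Eq. (52).

```python
from fractions import Fraction
from math import comb

def vertices(n_max):
    """Return dicts nu[n], mu[n] for n = 1..n_max from Eqs. (50)-(51)."""
    nu = {1: Fraction(1)}
    mu = {1: Fraction(1)}
    for n in range(2, n_max + 1):
        norm = Fraction(1, (2 * n + 3) * (n - 1))
        nu_n = Fraction(0)
        mu_n = Fraction(0)
        for m in range(1, n):
            weight = comb(n, m) * mu[m] * norm
            # Angular averages alpha -> 1 and beta -> 1/3 are already
            # folded into the coefficients below.
            nu_n += weight * ((2 * n + 1) * nu[n - m] + Fraction(2, 3) * mu[n - m])
            mu_n += weight * (3 * nu[n - m] + Fraction(2, 3) * n * mu[n - m])
        nu[n], mu[n] = nu_n, mu_n
    return nu, mu
```

Calling `vertices(3)` returns the values $\nu_2 = 34/21$, $\nu_3 = 682/189$, $\mu_2 = 26/21$ and $\mu_3 = 142/63$; higher orders follow by increasing `n_max`.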


This recursion relation plays a central role for the derivation of many results
in PT [43].

In particular, it can be shown that it is directly related to the spherical collapse
dynamics [43,222]. In this case the initial density field is such that it has a
spherical symmetry around $\mathbf{x} = 0$. As a consequence the Fourier transform of
the linear density field $\delta_1(\mathbf{k})$ depends only on the norm of $\mathbf{k}$, and this property
remains valid at any stage of the dynamics. Then the central density for such
initial conditions, $\delta_{\rm sc}$, can be written (assuming $\Omega_m = 1$ for definiteness)

$$\delta_{\rm sc}(a) = \sum_n a^n \int d^3q_1 \ldots \int d^3q_n\; F_n(\mathbf{q}_1,\ldots,\mathbf{q}_n)\,\delta_1(|\mathbf{q}_1|)\ldots\delta_1(|\mathbf{q}_n|). \tag{53}$$

Performing first the integration over the angles of the wave vectors, one recovers,

$$\delta_{\rm sc}(a) = \sum_n \frac{\nu_n}{n!}\, a^n \epsilon^n, \tag{54}$$

with $\epsilon = \int d^3q\, \delta_1(|\mathbf{q}|)$. Similarly the central velocity divergence for the spherical
collapse is expanded in terms of the $\mu_n$ parameters. The angular averages
of the PT kernels are thus directly related to the spherical collapse dynamics.
This result is valid for any cosmological model.
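The connection in Eq. (54) can be illustrated numerically. The sketch below compares the series truncated at second and third order, using $\nu_2$ and $\nu_3$ from Eq. (52), against the exact Einstein-de Sitter spherical collapse. The parametric solution used for the exact result (development angle $\theta$) is the standard textbook form and is an assumption brought in here, not something derived in the text; `linear` plays the role of $a\epsilon$.

```python
import math

NU2 = 34.0 / 21.0    # Eq. (52)
NU3 = 682.0 / 189.0  # Eq. (52)

def exact_collapse(theta):
    """Return (linear, exact) density contrast for an overdense
    Einstein-de Sitter perturbation with development angle theta
    (standard parametric solution, assumed here)."""
    s = theta - math.sin(theta)
    linear = 0.15 * (6.0 * s) ** (2.0 / 3.0)
    exact = 4.5 * s * s / (1.0 - math.cos(theta)) ** 3 - 1.0
    return linear, exact

def series(linear, order):
    """Eq. (54) truncated at the given order (1, 2 or 3)."""
    coeffs = [1.0, NU2 / 2.0, NU3 / 6.0]  # nu_n / n!
    return sum(c * linear ** (n + 1) for n, c in enumerate(coeffs[:order]))
```

For a mildly non-linear perturbation such as `theta = 0.6` (linear contrast of about 0.05), the third-order truncation is closer to the exact result than the second-order one, as expected for a convergent expansion at small amplitude.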
2.4.3 Cosmology Dependence of Non-Linear Growth Factors

In general the PT expansion is more complicated because the solutions at each
order become non-separable functions of $\tau$ and $\mathbf{k}$ [91,93,46,118]. In particular
the growing mode at order $n$ does not scale as $D_1^n(\tau)$ (or $a^n(\tau)$ as in Eq. (40)).
However, using the recursion relations, we can easily find the full dependence
on cosmological parameters for the vertices, that is, the dependence that arises
in the spherical collapse approximation. The PT kernels can then be constructed
order by order in terms of these solutions [46]. In the spherical model,
we can write

$$\delta(\tau) = \sum_{n=1}^{\infty} \frac{\nu_n(\tau)}{n!}\,[D_1(\tau)\,\epsilon]^n, \tag{55}$$

$$\theta(\tau) = -\mathcal{H}(\tau)\, f(\Omega_m,\Omega_\Lambda) \sum_{n=1}^{\infty} \frac{\mu_n(\tau)}{n!}\,[D_1(\tau)\,\epsilon]^n. \tag{56}$$

From the Fourier space equations of motion, Eqs. (37-38), and taking into
account that the spherical averages of $\alpha$ and $\beta$ can be taken at once, one gets,

$$\frac{d\nu_n}{d\log D_1} + n\,\nu_n - \mu_n = \sum_{m=1}^{n-1} \binom{n}{m}\, \nu_m\, \mu_{n-m}, \tag{57}$$

$$\frac{d\mu_n}{d\log D_1} + n\,\mu_n + \left(\frac{3\,\Omega_m}{2f^2} - 1\right)\mu_n - \frac{3\,\Omega_m}{2f^2}\,\nu_n = \frac{1}{3}\sum_{m=1}^{n-1} \binom{n}{m}\, \mu_m\, \mu_{n-m}, \tag{58}$$

noting that $d\log D_1 = f\,\mathcal{H}\,d\tau$. This hierarchy of differential equations must
then be solved numerically at each order. The results for $n = 2,3$ show that
indeed the dependence of the vertices on cosmological parameters is a few
percent effect at most [46,223].

For a perfect fluid with an equation of state $p = \eta\rho$ we have [259]

$$\nu_2 = \frac{2\,(17 + 48\,\eta + 27\,\eta^2)}{3\,(1+\eta)\,(7+15\,\eta)}, \tag{59}$$

for an Einstein-de Sitter Universe. Of course, this reduces to Eq. (52) as $\eta \to 0$.
For the Brans-Dicke Cosmology [98], with a coupling $\omega$ to gravity:

$$\nu_2 = \frac{34\,\omega + 56}{21\,\omega + 36}, \tag{60}$$

which reduces to the standard result $\nu_2 = 34/21$ in the limit $\omega \to \infty$ (see [259]
for details and results for $\nu_4$). Even in these extreme cosmologies, the possible
variations of $\nu_2$ are quite small given the observational constraints on $\eta$ and
$\omega$ [259].
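To get a feeling for how weakly $\nu_2$ depends on these parameters, Eqs. (59) and (60) can be evaluated directly. The following sketch uses exact fractions; the function names are illustrative and the sample parameter values in the usage note are arbitrary choices, not observational bounds.

```python
from fractions import Fraction

def nu2_fluid(eta):
    """Eq. (59): nu_2 for a perfect fluid with p = eta * rho (Einstein-de Sitter)."""
    eta = Fraction(eta)
    return Fraction(2) * (17 + 48 * eta + 27 * eta ** 2) / (3 * (1 + eta) * (7 + 15 * eta))

def nu2_brans_dicke(omega):
    """Eq. (60): nu_2 for Brans-Dicke gravity with coupling omega."""
    omega = Fraction(omega)
    return (34 * omega + 56) / (21 * omega + 36)
```

For example, `nu2_fluid(0)` returns exactly 34/21, and `nu2_brans_dicke(omega)` differs from 34/21 by roughly $0.11/\omega$ for large $\omega$, so the correction is already at the $10^{-4}$ level for $\omega$ of a few hundred.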
2.4.4 Approximate Solutions in Arbitrary Cosmology

This quite remarkable result calls for an explanation. It is indeed possible
to show that a simple approximation to the equations of motion for general
$\Omega_m$ and $\Omega_\Lambda$ leads to separable solutions to arbitrary order in PT and the same
recursion relations as in the Einstein-de Sitter case [560]. All the information
on the dependence of the PT solutions on the cosmological parameters $\Omega_m$
and $\Omega_\Lambda$ is then encoded in the linear growth factor, $D_1(\tau)$.

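In linear PT, the growing-mode solution to the equations of motion (37) and
(38) reads

$$\delta(\mathbf{k},\tau) = D_1(\tau)\,\delta_1(\mathbf{k}), \tag{61}$$

$$\theta(\mathbf{k},\tau) = -\mathcal{H}(\tau)\, f(\Omega_m,\Omega_\Lambda)\, D_1(\tau)\,\delta_1(\mathbf{k}), \tag{62}$$

where $D_1(\tau)$ is the linear growing mode. As mentioned before, we look for separable
solutions of the form (compare with Eq. (40))

$$\delta(\mathbf{k},\tau) = \sum_{n=1}^{\infty} D_n(\tau)\,\delta_n(\mathbf{k}), \tag{63}$$

$$\theta(\mathbf{k},\tau) = -\mathcal{H}(\tau)\, f(\Omega_m,\Omega_\Lambda) \sum_{n=1}^{\infty} E_n(\tau)\,\theta_n(\mathbf{k}). \tag{64}$$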


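From the equations of motion (37) and (38) we get for the $n^{\rm th}$ order solutions,

$$\frac{dD_n}{d\log D_1}\,\delta_n - E_n\,\theta_n = \int d^3k_1\, d^3k_2\; \delta_D(\mathbf{k} - \mathbf{k}_{12})\,\alpha(\mathbf{k},\mathbf{k}_1) \sum_{m=1}^{n-1} D_{n-m}\,E_m\;\theta_m(\mathbf{k}_1)\,\delta_{n-m}(\mathbf{k}_2), \tag{65}$$

$$\frac{dE_n}{d\log D_1}\,\theta_n + \left(\frac{3\,\Omega_m}{2f^2} - 1\right) E_n\,\theta_n - \frac{3\,\Omega_m}{2f^2}\,D_n\,\delta_n = \int d^3k_1\, d^3k_2\; \delta_D(\mathbf{k} - \mathbf{k}_{12})\,\beta(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) \sum_{m=1}^{n-1} E_{n-m}\,E_m\;\theta_m(\mathbf{k}_1)\,\theta_{n-m}(\mathbf{k}_2). \tag{66}$$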


By simple inspection, we see that if $f(\Omega_m,\Omega_\Lambda) = \Omega_m^{1/2}$ then the system of
equations becomes indeed separable, with $D_n = E_n = (D_1)^n$. In fact, the
recursion relations then reduce to the standard $\Omega_m = 1$, $\Omega_\Lambda = 0$ case, shown
in Eqs. (43) and (44). Then $\Omega_m/f^2 = 1$ leads to separability of the PT
solutions to any order, generalizing what has been noted before in the case of
second order PT [432]. From Section 2.3, the approximation $f(\Omega_m,\Omega_\Lambda) \approx \Omega_m^{1/2}$
is actually very good in practice. As a result, for example, as we review in
the next section, the exact solution for the $\Omega_\Lambda = 0$ case gives $D_2/(D_1)^2 =
1 + (3/17)\,(\Omega_m^{-2/63} - 1)$, extremely insensitive to $\Omega_m$, even more than what the
approximation $f(\Omega_m,\Omega_\Lambda) = \Omega_m^{1/2}$ would suggest, since for most of the
time evolution $\Omega_m$ and $\Omega_\Lambda$ are close to their Einstein-de Sitter values.


2.4.5 The Density and Velocity Fields up to Third Order

The computations of the local density field can be done order by order for
any cosmological model. We give here their explicit expression up to third
order. The detailed calculations can be found in [46]. Different approaches
have been used in the literature to do such calculations [105,118,93]. The
direct calculation appears to be the most secure, if not the most rapid or most
instructive.

The time dependence of the solutions can be written as a function of $D_1(\tau)$,
$\nu_2(\tau)$, $\nu_3(\tau)$ and an auxiliary function $\lambda_3(\tau)$ which satisfies,

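$$\frac{d^2(\lambda_3 D_1^3)}{d\tau^2} + \mathcal{H}\,\frac{d(\lambda_3 D_1^3)}{d\tau} - \frac{3}{2}\,\mathcal{H}^2\,\Omega_m\,\lambda_3 D_1^3 = \frac{3}{2}\,\mathcal{H}^2\,\Omega_m\, D_1^3, \tag{67}$$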

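with $\lambda_3 \approx 9/10$ when $\tau \to 0$. The geometrical dependences can all be expressed
in terms of the two functions $\alpha(\mathbf{q}_i,\mathbf{q}_j)$ [see Eq. (39)] and

$$\gamma(\mathbf{q}_i,\mathbf{q}_j) = \frac{1}{2}\left[\alpha(\mathbf{q}_i,\mathbf{q}_j) + \alpha(\mathbf{q}_j,\mathbf{q}_i)\right] - \beta(\mathbf{q}_i,\mathbf{q}_j) = 1 - \frac{(\mathbf{q}_i\cdot\mathbf{q}_j)^2}{q_i^2\, q_j^2}, \tag{68}$$

which for short will be denoted $\alpha_{i,j}$ and $\gamma_{i,j}$ respectively. Then we have,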



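$$F_2(\mathbf{q}_1,\mathbf{q}_2) = \left(\frac{3}{4}\,\nu_2 - \frac{3}{2}\right)\gamma_{1,2} + \alpha_{1,2}, \tag{69}$$

$$G_2(\mathbf{q}_1,\mathbf{q}_2) = -f(\Omega_m,\Omega_\Lambda)\left[\left(\frac{3}{4}\,\mu_2 - \frac{3}{2}\right)\gamma_{1,2} + \alpha_{1,2}\right], \tag{70}$$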


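for the second-order solutions. Their symmetrized parts can be shown to take
the form (see Section 2.7),

$$F_2^{(s)}(\mathbf{q}_1,\mathbf{q}_2) = \frac{1}{2}(1+\epsilon) + \frac{1}{2}\,\frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1 q_2}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right) + \frac{1}{2}(1-\epsilon)\,\frac{(\mathbf{q}_1\cdot\mathbf{q}_2)^2}{q_1^2\, q_2^2}, \tag{71}$$

$$G_2^{(s)}(\mathbf{q}_1,\mathbf{q}_2) = \epsilon + \frac{1}{2}\,\frac{\mathbf{q}_1\cdot\mathbf{q}_2}{q_1 q_2}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right) + (1-\epsilon)\,\frac{(\mathbf{q}_1\cdot\mathbf{q}_2)^2}{q_1^2\, q_2^2}, \tag{72}$$

where $\epsilon \approx (3/7)\,\Omega_m^{-2/63}$ for $\Omega_m > 0.1$ [93]. At third order the kernel reads,

$$F_3(\mathbf{q}_1,\mathbf{q}_2,\mathbf{q}_3) = R_1 + \nu_2\, R_2 + \nu_3\, R_3 + \lambda_3\, R_4, \tag{73}$$

where, using the simplified notation $\alpha_{ij,k} = \alpha(\mathbf{q}_i + \mathbf{q}_j, \mathbf{q}_k)$, $\alpha_{i,jk} = \alpha(\mathbf{q}_i, \mathbf{q}_j + \mathbf{q}_k)$
and similar definitions for $\gamma_{ij,k}$ and $\gamma_{i,jk}$, we have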


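$$R_1 = \left(\frac{1}{2}\,\alpha_{3,12} + \frac{1}{2}\,\alpha_{12,3} - \frac{1}{3}\,\gamma_{3,12}\right)\alpha_{1,2} + \left(-\frac{3}{2}\,\alpha_{12,3} - \frac{4}{3}\,\alpha_{3,12} + \frac{5}{2}\,\gamma_{3,12}\right)\gamma_{1,2}, \tag{74}$$

$$R_2 = \frac{3}{4}\left(\alpha_{3,12} + \alpha_{12,3} - 3\,\gamma_{3,12}\right)\gamma_{1,2}, \tag{75}$$

$$R_3 = \frac{3}{8}\,\gamma_{3,12}\,\gamma_{1,2}, \tag{76}$$

$$R_4 = \frac{2}{3}\,\gamma_{3,12}\,\alpha_{1,2} - \left(\frac{1}{3}\,\alpha_{3,12} + \frac{1}{2}\,\gamma_{3,12}\right)\gamma_{1,2}. \tag{77}$$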


These results exhibit the explicit time and geometrical dependence of the
density field up to third order (a similar expression can be found for $G_3$,
see [46]). In Chapter 5 we examine the consequences of these results for the
statistical properties of the cosmic fields.

2.4.6 Non-Linear Growing and Decaying Modes

Perturbation theory describes the non-linear dynamics as a collection of linear
waves, $\delta_1(\mathbf{k})$, interacting through the mode-coupling functions $\alpha$ and $\beta$
in Eq. (39). Even if the initial conditions are set in the growing mode, after
scattering due to non-linear interactions waves do not remain purely in the
growing mode. In the standard treatment, described above, the sub-dominant
time-dependencies that necessarily appear due to this process have been neglected,
i.e., only the fastest growing mode (proportional to $D_1^n$) is taken into
account at each order $n$ in PT. Here we discuss how one can generalize the
standard results to include the full time dependence of the solutions at every
order in PT [561,569]. This is necessary, for example, to properly address the
problem of transients in N-body simulations in which initial conditions are set
up using the Zel’dovich Approximation (see Section 2.5). This is reviewed in
Section 5.7. In addition, the approach presented here can be useful to address
evolution from non-Gaussian initial conditions.

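The equations of motion can be rewritten in a more symmetric form by defining
a two-component “vector” $\Psi_a(\mathbf{k},z)$, where $a = 1,2$, $z \equiv \ln a$ (we assume
$\Omega_m = 1$ for definiteness), and:

$$\Psi_a(\mathbf{k},z) \equiv \Big(\delta(\mathbf{k},z),\; -\theta(\mathbf{k},z)/\mathcal{H}\Big), \tag{78}$$

which leads to the following equations of motion (we henceforth use the convention
that repeated Fourier arguments are integrated over)

$$\partial_z \Psi_a(\mathbf{k},z) + \Omega_{ab}\,\Psi_b(\mathbf{k},z) = \gamma_{abc}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2)\;\Psi_b(\mathbf{k}_1,z)\;\Psi_c(\mathbf{k}_2,z), \tag{79}$$

where $\gamma_{abc}$ is a matrix whose only non-zero elements are $\gamma_{121}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) =
\delta_D(\mathbf{k} - \mathbf{k}_1 - \mathbf{k}_2)\,\alpha(\mathbf{k},\mathbf{k}_1)$ and $\gamma_{222}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) = \delta_D(\mathbf{k} - \mathbf{k}_1 - \mathbf{k}_2)\,\beta(\mathbf{k}_1,\mathbf{k}_2)$, and

$$\Omega_{ab} \equiv \begin{bmatrix} 0 & -1 \\ -3/2 & 1/2 \end{bmatrix}. \tag{80}$$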
The somewhat complicated expressions for the PT kernels recursion relations
in Sect. 2.4.2 can be easily derived in this formalism. The perturbative solutions
read [see Eq. (40)]

$$\Psi_a(\mathbf{k},z) = \sum_{n=1}^{\infty} e^{nz}\,\psi_a^{(n)}(\mathbf{k}), \tag{81}$$

which leads to

$$(n\,\delta_{ab} + \Omega_{ab})\,\psi_b^{(n)}(\mathbf{k}) = \gamma_{abc}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) \sum_{m=1}^{n-1} \psi_b^{(n-m)}(\mathbf{k}_1)\;\psi_c^{(m)}(\mathbf{k}_2). \tag{82}$$

Now, let $\sigma_{ab}^{-1}(n) = n\,\delta_{ab} + \Omega_{ab}$, then we have:

$$\psi_a^{(n)}(\mathbf{k}) = \sigma_{ab}(n)\;\gamma_{bcd}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) \sum_{m=1}^{n-1} \psi_c^{(n-m)}(\mathbf{k}_1)\;\psi_d^{(m)}(\mathbf{k}_2), \tag{83}$$

where

$$\sigma_{ab}(n) = \frac{1}{(2n+3)(n-1)} \begin{bmatrix} 2n+1 & 2 \\ 3 & 2n \end{bmatrix}. \tag{84}$$

Equation (83) is the equivalent of the recursion relations in Eqs. (43-44), for
the $n^{\rm th}$ order Fourier amplitude solutions $\psi_a^{(n)}(\mathbf{k})$.
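That the matrix in Eq. (84) is indeed the inverse of $n\,\delta_{ab} + \Omega_{ab}$, with $\Omega_{ab}$ from Eq. (80), can be confirmed with a few lines of exact arithmetic. The sketch below is purely illustrative; the helper names are ours.

```python
from fractions import Fraction

# Eq. (80)
OMEGA = ((Fraction(0), Fraction(-1)),
         (Fraction(-3, 2), Fraction(1, 2)))

def sigma_inverse(n):
    """n * delta_ab + Omega_ab."""
    return tuple(tuple((n if a == b else 0) + OMEGA[a][b] for b in range(2))
                 for a in range(2))

def sigma(n):
    """Eq. (84); only defined for n >= 2 because of the (n - 1) factor."""
    pref = Fraction(1, (2 * n + 3) * (n - 1))
    return ((pref * (2 * n + 1), pref * 2),
            (pref * 3, pref * 2 * n))

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))
```

The prefactor $1/[(2n+3)(n-1)]$ is half the inverse determinant of $n\,\delta_{ab} + \Omega_{ab}$, which is why the same combination appears in the recursion relations (43) and (44). It diverges at $n = 1$, reflecting the fact that the linear growing mode is a homogeneous solution.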

To go beyond this, that is, to incorporate the transient behavior before the
asymptotics of solutions in Eq. (81) are valid, it turns out to be convenient
to write down the equation of motion, Eq. (79), in integral form. Laplace
transformation in the variable $z$ leads to:

$$\sigma_{ab}^{-1}(\omega)\,\Psi_b(\mathbf{k},\omega) = \phi_a(\mathbf{k}) + \gamma_{abc}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2) \oint \frac{d\omega_1}{2\pi i}\;\Psi_b(\mathbf{k}_1,\omega_1)\;\Psi_c(\mathbf{k}_2,\omega - \omega_1), \tag{85}$$

where $\phi_a(\mathbf{k})$ denote the initial conditions, that is $\Psi_a(\mathbf{k}, z = 0) \equiv \phi_a(\mathbf{k})$. Multiplying
by the matrix $\sigma_{ab}$, and performing the inversion of the Laplace transform
gives [569]

$$\Psi_a(\mathbf{k},z) = g_{ab}(z)\,\phi_b(\mathbf{k}) + \int_0^z ds\; g_{ab}(z-s)\;\gamma_{bcd}(\mathbf{k},\mathbf{k}_1,\mathbf{k}_2)\;\Psi_c(\mathbf{k}_1,s)\,\Psi_d(\mathbf{k}_2,s), \tag{86}$$

where the linear propagator $g_{ab}(z)$ is defined as ($c > 1$ to pick out the standard
retarded propagator [561])

$$g_{ab}(z) = \oint_{c-i\infty}^{c+i\infty} \frac{d\omega}{2\pi i}\;\sigma_{ab}(\omega)\,e^{\omega z} = \frac{e^{z}}{5}\begin{bmatrix} 3 & 2 \\ 3 & 2 \end{bmatrix} - \frac{e^{-3z/2}}{5}\begin{bmatrix} -2 & 2 \\ 3 & -3 \end{bmatrix}, \tag{87}$$

for $z > 0$, whereas $g_{ab}(z) = 0$ for $z < 0$ due to causality, and $g_{ab}(z) \to \delta_{ab}$
as $z \to 0^+$. The first term in Eq. (87) represents the propagation of linear
growing mode solutions, whereas the second corresponds to the decaying modes
propagation. Equation (86) can be thought of as an equation for $\Psi_a(\mathbf{k},z)$ in
the presence of an “external source” $\phi_b(\mathbf{k})$ with prescribed statistics given
by the initial conditions$^4$. It contains the full time dependence of non-linear
solutions, as will be discussed in detail in Sect. 5.7. To recover the standard
(asymptotic) time dependence one must take the initial conditions to be set
in the growing mode, $\phi_b \propto (1,1)$, which vanishes upon contraction with the
second term in Eq. (87), and reduces to the familiar linear scaling $\phi_a(z) =
e^z \phi_a(0) = a(\tau)\,\phi_a(0)$; and, in addition, set the lower limit of integration in
Eq. (86) to $s = -\infty$, to place initial conditions “infinitely far away” in the
past.
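The properties of the linear propagator quoted above are simple to verify numerically. The sketch below evaluates Eq. (87) and applies it to a two-component vector; the names `propagator` and `apply_matrix` are illustrative.

```python
import math

GROWING = ((3.0, 2.0), (3.0, 2.0))      # first matrix in Eq. (87)
DECAYING = ((-2.0, 2.0), (3.0, -3.0))   # second matrix in Eq. (87)

def propagator(z):
    """Linear propagator g_ab(z) of Eq. (87); zero for z < 0 (causality)."""
    if z < 0:
        return ((0.0, 0.0), (0.0, 0.0))
    grow = math.exp(z) / 5.0
    decay = math.exp(-1.5 * z) / 5.0
    return tuple(tuple(grow * GROWING[a][b] - decay * DECAYING[a][b]
                       for b in range(2)) for a in range(2))

def apply_matrix(g, phi):
    """Contract g_ab with the two-component vector phi_b."""
    return tuple(sum(g[a][b] * phi[b] for b in range(2)) for a in range(2))
```

Acting on growing-mode initial conditions $(1,1)$ the propagator returns $e^z(1,1)$, while acting on $(1,-3/2)$ it returns $e^{-3z/2}(1,-3/2)$, the linear decaying mode for $\Omega_m = 1$.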


2.5 Lagrangian Dynamics 


So far we have dealt with density and velocity fields and their equations of motion.
However, it is possible to develop non-linear PT in a different framework,
the so-called Lagrangian scheme, by following the trajectories of particles or
fluid elements [705,102,465], rather than studying the dynamics of density and
velocity fields$^5$. In Lagrangian PT$^6$, the object of interest is the displacement
field $\mathbf{\Psi}(\mathbf{q})$ which maps the initial particle positions $\mathbf{q}$ into the final Eulerian
particle positions $\mathbf{x}$,

$$\mathbf{x}(\tau) = \mathbf{q} + \mathbf{\Psi}(\mathbf{q},\tau). \tag{88}$$

The equation of motion for particle trajectories $\mathbf{x}(\tau)$ is then

$$\frac{d^2\mathbf{x}}{d\tau^2} + \mathcal{H}(\tau)\,\frac{d\mathbf{x}}{d\tau} = -\nabla\Phi, \tag{89}$$

where $\Phi$ denotes the gravitational potential, and $\nabla$ the gradient operator in
Eulerian coordinates $\mathbf{x}$. Taking the divergence of this equation we obtain

$$J(\mathbf{q},\tau)\;\nabla\cdot\left[\frac{d^2\mathbf{\Psi}}{d\tau^2} + \mathcal{H}(\tau)\,\frac{d\mathbf{\Psi}}{d\tau}\right] = \frac{3}{2}\,\Omega_m\,\mathcal{H}^2\,(J - 1), \tag{90}$$

where we have used the Poisson equation together with the fact that the density
field obeys $\bar\rho\,(1 + \delta(\mathbf{x}))\,d^3x = \bar\rho\,d^3q$, thus

$$1 + \delta(\mathbf{x}) = \frac{1}{{\rm Det}\left(\delta_{ij} + \Psi_{i,j}\right)} \equiv \frac{1}{J(\mathbf{q},\tau)}, \tag{91}$$

where $\Psi_{i,j} \equiv \partial\Psi_i/\partial q_j$, and $J(\mathbf{q},\tau)$ is the Jacobian of the transformation
between Eulerian and Lagrangian space. Note that when there is shell crossing,
i.e. fluid elements with different initial positions $\mathbf{q}$ end up at the same Eulerian
position $\mathbf{x}$ through the mapping in Eq. (88), the Jacobian vanishes and the
density field becomes singular. At these points the description of dynamics in
terms of a mapping does not hold anymore.

4 This is essentially a field-theoretic description of gravitational instability;
non-linear corrections can be thought of as loop corrections to the propagator and the
vertex given by the $\gamma_{abc}$ matrix, see [569] for details.

5 It is also possible to study Lagrangian dynamics of density and velocity fields
following the fluid elements, by using the convective derivative $D/Dt \equiv \partial/\partial t + \mathbf{u}\cdot\nabla$
in the equations of motion, Eqs. (16-17). We will not discuss this possibility here,
but e.g. see [62,327].

6 For reviews of Lagrangian PT, see e.g. [107,94].

Equation (90) can be fully rewritten in terms of Lagrangian coordinates by
using that $\nabla_i = (\delta_{ij} + \Psi_{i,j})^{-1}\,\nabla_{q_j}$, where $\nabla_q \equiv \partial/\partial\mathbf{q}$ denotes the gradient operator
in Lagrangian coordinates. The resulting non-linear equation for $\mathbf{\Psi}(\mathbf{q})$
is then solved perturbatively, expanding about its linear solution.


2.6 Linear Solutions and the Zel’dovich Approximation 
The linear solution of Eq. (90) is

$$\nabla_q\cdot\mathbf{\Psi}^{(1)} = -D_1(\tau)\,\delta(\mathbf{q}), \tag{92}$$

where $\delta(\mathbf{q})$ denotes the density field imposed by the initial conditions and
$D_1(\tau)$ is the linear growth factor, which obeys Eq. (22). We implicitly assume
that vorticity vanishes; then Eq. (92) completely determines the displacement
field to linear order. Linear Lagrangian solutions have the property that they
become exact for local one-dimensional motion, i.e. when the two eigenvalues
of the velocity gradient along the trajectory vanish [102]. Note that the evolution
of fluid elements at this order is local, i.e. it does not depend on the
behavior of the rest of fluid elements.

The Zel’dovich Approximation (hereafter ZA) [705] consists in using the linear
displacement field as an approximate solution for the dynamical equations$^7$.
It follows from Eq. (91) that the local density field reads,

$$1 + \delta(\mathbf{x},\tau) = \frac{1}{[1 - \lambda_1 D_1(\tau)]\,[1 - \lambda_2 D_1(\tau)]\,[1 - \lambda_3 D_1(\tau)]}, \tag{93}$$

where $\lambda_i$ are the local eigenvalues of the tidal tensor $\Psi_{i,j}$. From this expression
we can see that depending on the relative magnitude of these eigenvalues, the
ZA leads to planar collapse (one positive eigenvalue larger than the rest), filamentary
collapse (two positive eigenvalues larger than the third), or spherical
collapse (all eigenvalues positive and equal). If all eigenvalues are negative,
then the evolution corresponds to an underdense region, eventually reaching
$\delta = -1$. For Gaussian initial conditions, it is possible to work out the probability
distribution for the eigenvalues [190], which leads through the non-linear
transformation in Eq. (93) to a characterization of the one-point statistical
properties of the density field. These results will be discussed in Section 5.8.3.

7 Rigorously, the ZA results from using the linear displacement field with the constraint
that at large scales one recovers linear Eulerian PT [103].
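Equation (93) is a purely local mapping from the three eigenvalues and the growth factor to the density contrast, which the following sketch makes explicit. The function name and the choice to raise an error at shell crossing are illustrative assumptions, not part of the original treatment.

```python
def za_density_contrast(eigenvalues, growth):
    """Density contrast from Eq. (93) for eigenvalues (lambda_1, lambda_2,
    lambda_3) and linear growth factor D_1. Raises ValueError once the
    Jacobian reaches zero, i.e. at shell crossing, where the mapping
    description no longer holds."""
    jacobian = 1.0
    for lam in eigenvalues:
        jacobian *= 1.0 - lam * growth
    if jacobian <= 0.0:
        raise ValueError("shell crossing: the Jacobian is no longer positive")
    return 1.0 / jacobian - 1.0
```

For instance, a planar configuration with eigenvalues $(1,0,0)$ gives $\delta = 1$ at $D_1 = 0.5$ and diverges as $D_1 \to 1$, while eigenvalues $(-1,-1,-1)$ give an underdensity that approaches $\delta = -1$ only asymptotically.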


2.7 Lagrangian Perturbation Theory


Unlike in Eulerian PT, there is no known recursive solution for the expression
of the order by order cosmic fields in Lagrangian PT, even for the Einstein-de
Sitter case. One reason for that is that beyond second order, even though one
can assume an irrotational flow in Eulerian space, this does not imply that
the displacement field is irrotational [105]. It has been stressed that already
second-order Lagrangian PT for the displacement field (hereafter 2LPT) does
provide a remarkable improvement over the ZA in describing the global properties
of density and velocity fields [106,455,93], and in most practical cases the
improvement brought by third-order Lagrangian PT is marginal [106,455].
One way to understand this situation is to recall that the Lagrangian picture
is intrinsically non-linear in the density field (e.g. see Eq. (91)), and a
small perturbation in Lagrangian fluid element paths carries a considerable
amount of non-linear information about the corresponding Eulerian density
and velocity fields. In particular, as we shall see below, a truncation of Lagrangian
PT at a fixed order yields non-zero Eulerian PT kernels at every
order. However, as we shall review in the next few chapters, this is not always
an advantage, particularly when dealing with initial conditions with enough
small-scale power where shell crossing is significant. In these cases, Lagrangian
PT generally breaks down at scales larger than Eulerian PT.

The reason for the remarkable improvement of 2LPT over ZA is in fact not
surprising. The solution of Eq. (90) to second order describes the correction
to the ZA displacement due to gravitational tidal effects, that is, it takes into
account the fact that gravitational instability is non-local. It reads

$$\nabla_q\cdot\mathbf{\Psi}^{(2)} = \frac{1}{2}\,D_2(\tau) \sum_{i\neq j} \left(\Psi^{(1)}_{i,i}\,\Psi^{(1)}_{j,j} - \Psi^{(1)}_{i,j}\,\Psi^{(1)}_{j,i}\right), \tag{94}$$

where $D_2(\tau)$ denotes the second-order growth factor, which for $0.1 < \Omega_m < 3$
($\Omega_\Lambda = 0$) obeys

$$D_2(\tau) \approx -\frac{3}{7}\,D_1^2(\tau), \tag{95}$$

or more precisely

$$D_2(\tau) \approx -\frac{3}{7}\,D_1^2(\tau)\,\Omega_m^{-2/63}, \tag{96}$$

to better than 7% and 0.5% respectively [91], whereas for flat models with
non-zero cosmological constant $\Omega_\Lambda$ we have for $0.01 < \Omega_m < 1$

$$D_2(\tau) \approx -\frac{3}{7}\,D_1^2(\tau)\,\Omega_m^{-1/143}, \tag{97}$$

to better than 0.6% [93]. Since Lagrangian solutions up to second-order are
curl-free$^8$, it is convenient to define Lagrangian potentials $\phi^{(1)}$ and $\phi^{(2)}$ so
that in 2LPT

$$\mathbf{x}(\mathbf{q}) = \mathbf{q} - D_1\,\nabla_q\phi^{(1)} + D_2\,\nabla_q\phi^{(2)}, \tag{98}$$

and the velocity field then reads

$$\mathbf{u} = -D_1\,f_1\,\mathcal{H}\,\nabla_q\phi^{(1)} + D_2\,f_2\,\mathcal{H}\,\nabla_q\phi^{(2)}, \tag{99}$$

where the logarithmic derivatives of the growth factors $f_i \equiv (d\ln D_i)/(d\ln a)$
can be approximated for open models with $0.1 < \Omega_m < 1$ by

$$f_1 \approx \Omega_m^{3/5}, \qquad f_2 \approx 2\,\Omega_m^{4/7}, \tag{100}$$

to better than 2% [506] and 5% [93], respectively. For flat models with non-zero
cosmological constant $\Omega_\Lambda$ we have for $0.01 < \Omega_m < 1$

$$f_1 \approx \Omega_m^{5/9}, \qquad f_2 \approx 2\,\Omega_m^{6/11}, \tag{101}$$

to better than 10% and 12%, respectively [93]. The accuracy of these two fits
improves significantly for $\Omega_m > 0.1$, in the relevant range according to present
observations. Summarizing, the time-independent potentials in Eqs. (98) and
(99) obey the following Poisson equations [106]

$$\nabla_q^2\,\phi^{(1)}(\mathbf{q}) = \delta(\mathbf{q}), \tag{102}$$

$$\nabla_q^2\,\phi^{(2)}(\mathbf{q}) = \sum_{i>j} \left[\phi^{(1)}_{,ii}(\mathbf{q})\;\phi^{(1)}_{,jj}(\mathbf{q}) - \left(\phi^{(1)}_{,ij}(\mathbf{q})\right)^2\right]. \tag{103}$$

It is possible to improve on 2LPT by going to third-order in the displacement
field (3LPT); however, it becomes more costly due to the need of solving
three additional Poisson equations [105,117]. Third-order results give a better
behavior in underdense regions [93] and lead to additional substructure in
high-density regions [108]. Detailed comparison of Lagrangian PT at different
orders against numerical simulations is given in [93,367].
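The source term of the second-order Poisson equation, Eq. (103), is a local quadratic function of the Hessian of $\phi^{(1)}$. As a minimal illustration (the function name is ours, and the Hessian is assumed to be supplied as a symmetric 3x3 nested list at a single point):

```python
def second_order_source(hessian):
    """Right-hand side of Eq. (103) from the 3x3 Hessian phi^(1)_{,ij}."""
    total = 0.0
    for i in range(3):
        for j in range(i):  # pairs with i > j
            total += hessian[i][i] * hessian[j][j] - hessian[i][j] ** 2
    return total
```

For a one-dimensional (planar) perturbation only one diagonal entry of the Hessian is non-zero and the source vanishes, so the second-order displacement is zero. This is consistent with the statement above that the linear Lagrangian solution is exact for local one-dimensional motion.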

8 This is assuming that initial conditions are in the growing mode; for a more
general treatment see [104].
2.8 Non-Linear Approximations


When density fluctuations become strongly non-linear, PT breaks down and
one has to resort to numerical simulations to study their evolution. However,
numerical simulations provide limited insight into the physics of gravitational
clustering. On the other hand, many non-linear approximations to
the equations of motion have been suggested in the literature which allow
calculations to be extrapolated to the non-linear regime. However, as we shall
see, it seems fair to say that these approximations have mostly been useful
to gain understanding about different aspects of gravitational clustering while
quantitatively none of them seem to be accurate enough for practical use.
Rigorous PT has provided a very useful way to benchmark these different
approximations in the weakly non-linear regime.

In general, most non-linear approximations can be considered as different
assumptions (valid in linear PT) that replace Poisson’s equation [470]. These
modified dynamics are often local, in the sense described above for the ZA,
in order to provide a simpler way of calculating the evolution of perturbations
than the full non-local dynamics.

Probably the best known of non-linear approximations is the ZA, which in
Eulerian space is equivalent to replacing the Poisson equation by the following
ansatz [470,327]

$$\mathbf{u}(\mathbf{x},\tau) = -\frac{2f}{3\,\Omega_m\,\mathcal{H}(\tau)}\,\nabla\Phi(\mathbf{x},\tau), \tag{104}$$

which is the relation between velocity and gravitational potential valid in
linear PT. Conservation of momentum (assuming for definiteness $\Omega_m = 1$)
then becomes [see Eq. (17)]

$$\frac{\partial\mathbf{u}(\mathbf{x},\tau)}{\partial\tau} - \frac{\mathcal{H}(\tau)}{2}\,\mathbf{u}(\mathbf{x},\tau) + \mathbf{u}(\mathbf{x},\tau)\cdot\nabla\mathbf{u}(\mathbf{x},\tau) = 0. \tag{105}$$

It is straightforward to find the PT recursion relations using these equations of
motion [557]; the result for the density field kernel is particularly simple [274]

$$F_n^{\rm ZA}(\mathbf{q}_1,\ldots,\mathbf{q}_n) = \frac{1}{n!}\,\frac{\mathbf{k}\cdot\mathbf{q}_1}{q_1^2}\cdots\frac{\mathbf{k}\cdot\mathbf{q}_n}{q_n^2}, \tag{106}$$

where $\mathbf{k} \equiv \mathbf{q}_1 + \ldots + \mathbf{q}_n$. As we mentioned before, the ZA is a local approximation
and becomes the exact dynamics in one-dimensional collapse. It is also
possible to formulate local approximations that besides being exact for planar
collapse like the ZA, are also exact for spherical [62] and even cylindrical
collapse [327]. However, their implementation for the calculation of statistical
properties of density and velocity fields is not straightforward.
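Since Eq. (106) is a simple product, the ZA kernels are easy to evaluate at any order. The sketch below (illustrative names, wave vectors given as 3-tuples) implements it and can be compared with the second-order closed form $F_2^{\rm ZA} = \frac{1}{2}\left[1 + \mu\,(q_1/q_2 + q_2/q_1) + \mu^2\right]$, where $\mu$ is the cosine of the angle between $\mathbf{q}_1$ and $\mathbf{q}_2$; this closed form follows from expanding Eq. (106) for $n = 2$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def za_kernel(qs):
    """Eq. (106): ZA density kernel for a list of wave vectors q_1..q_n."""
    k = tuple(sum(components) for components in zip(*qs))
    result = 1.0 / math.factorial(len(qs))
    for q in qs:
        result *= dot(k, q) / dot(q, q)
    return result

def za_f2_closed_form(q1, q2):
    """Second-order ZA kernel written in terms of mu and q1/q2."""
    n1, n2 = math.sqrt(dot(q1, q1)), math.sqrt(dot(q2, q2))
    mu = dot(q1, q2) / (n1 * n2)
    return 0.5 * (1.0 + mu * (n1 / n2 + n2 / n1) + mu ** 2)
```

Comparing with Eq. (45), the ZA reproduces the dipole term of the exact second-order kernel but replaces the coefficients 5/7 and 2/7 by 1/2 and 1/2, which lowers the angular average from 17/21 to 2/3.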

A significant shortcoming of the ZA is the fact that after shell crossing (“pancake
formation”), matter continues to flow throughout the pancake without
ever turning around, washing out structures at small scales. This can be fixed
phenomenologically by adding some small effective viscosity to Eq. (105),
which then becomes the Burgers’ equation$^9$

$$\frac{\partial\mathbf{u}(\mathbf{x},\tau)}{\partial\tau} - \frac{\mathcal{H}(\tau)}{2}\,\mathbf{u}(\mathbf{x},\tau) + \mathbf{u}(\mathbf{x},\tau)\cdot\nabla\mathbf{u}(\mathbf{x},\tau) = \nu\,\nabla^2\mathbf{u}(\mathbf{x},\tau). \tag{107}$$

This is the so-called adhesion approximation [278]. This equation has the nice
property that for a potential flow it can be reduced to a linear diffusion equation,
and therefore solved exactly. Given the initial conditions, this can be used
to predict the location of pancakes and clusters, giving good agreement when
compared to numerical simulations [381]. More detailed comparisons with numerical
simulations for density field statistics show an improvement over the
ZA at small scales [683]; however, at weakly non-linear scales the adhesion
approximation is essentially equal to the ZA.

The linear potential approximation [97,13] assumes that the gravitational potential
remains the same as in the linear regime, therefore

$$\nabla^2\Phi(\mathbf{x},\tau) = \frac{3}{2}\,\Omega_m\,\mathcal{H}^2(\tau)\,\delta_1(\mathbf{x},\tau), \tag{108}$$

where $\delta_1(\mathbf{x},\tau) = D_1^{(+)}(\tau)\,\delta_1(\mathbf{x})$ is the linearly extrapolated density field. The
idea behind this approximation is that since $\Phi \propto \delta/k^2$, the gravitational potential
is dominated by long-wavelength modes more than the density field,
and therefore it ought to obey linear PT to a better approximation.

In the frozen flow approximation [433], the velocity field is instead assumed to
remain linear,

$$\theta(\mathbf{x},\tau) = -\mathcal{H}(\tau)\,f(\Omega_m,\Omega_\Lambda)\,\delta_1(\mathbf{x},\tau), \tag{109}$$

i.e. the velocity field kernels $G_n^{(s)} \equiv 0$ ($n > 1$). In the next chapters we will
briefly review how these different approximations compare in the weakly non-linear
regime [470,471,47,557], see e.g. Table 4 in Chapter 5.

2.9 Numerical Simulations 

2.9.1 Introduction 

Cosmological dark matter simulations have become a central tool in predicting
the evolution of structure in the universe well into the non-linear regime.
Current state of the art numerical simulations can follow the dynamics of
about $10^9$ particles (see e.g. [163]), which although impressive, is still tens of
orders of magnitude smaller than the number of dark matter particles expected
in a cosmological volume, as mentioned in the introduction.

9 An attempt to see how this equation might arise from the physics of multi-streaming
has been given in [109].

However, this is not an insurmountable limitation. As we discussed in Section 2.1,
in the limit that the number of particles $N \gg 1$, collisionless dark
matter obeys the Vlasov equation for the distribution function in phase space,
Eq. (12). The task of numerical simulations is to sample this distribution
by partitioning phase space into $N$ elementary volumes, “particles” with positions,
velocities and (possibly different) masses $m_i$, $i = 1,\ldots,N$, and following
the evolution of these test particles due to the action of gravity and the expansion
of the universe (technically, these particles obey the equations of the
characteristics of the Vlasov equation). The number of particles $N$ fixes the
mass resolution of the numerical simulation.

Each particle $i$ can be thought of as carrying a “smooth” density profile, which
can be viewed as a “cloud” of typical size $\epsilon_i$. The parameter $\epsilon_i$ is called the
softening length (associated to particle $i$). In general, $\epsilon_i \propto m_i^{1/3}$. This softening
is introduced to suppress interactions between nearby particles in order to
reduce N-body relaxation, which is an artifact of the discrete description of
the distribution function. It fixes the spatial resolution of the simulation. In
general it is chosen to be a small fraction of the (local or global) mean inter-particle
separation, but this can vary significantly depending on the type of
code used.

In this section, we briefly discuss methods used to solve numerically the Vlasov
equation. A complete discussion of N-body methods is beyond the scope of this
work; we shall only describe the most common methods, closely following [155];
for a comprehensive review see e.g. [63].

The basic steps in an N-body simulation can be summarized as follows:

(i) implementation of initial conditions ([379,199], see e.g. [64] and references 
therein for recent developments); 

(ii) calculation of the force by solving the Poisson equation; 

(iii) update of positions and velocities of particles; 

(iv) diagnostics, e.g. tests of energy conservation; 

(v) go back to (ii) until simulation is completed. 

In general, step (iii) is performed with time integrators accurate to second or¬ 
der, preferably symplectic (i.e. that preserve phase-space volume). The Leapfrog 
integrator (e.g., [314]), where velocities and positions are shifted from each 
other by half a time-step, is probably the most common one. The Predictor- 
Corrector scheme is also popular since it allows easy implementation of indi¬ 
vidual, varying time-step per particle (e.g., [601]). Low-order integrators are 
used mostly to minimize the storage of variables for a large number of particles 
whose orbits must be integrated and to reduce the cost of the force calculation. 
Because of the chaotic nature of gravitational dynamics it is not feasible to 
follow very accurately individual particle orbits but only to properly recover 
the properties of bound objects in a statistical sense. 
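A minimal sketch of step (iii) with a second-order Leapfrog integrator is given below. It is written in the kick-drift-kick form, which is equivalent to the staggered half-step formulation described above; the one-dimensional state, the fixed time-step and the function names are simplifying assumptions made for illustration, and the expansion of the universe is ignored.

```python
def leapfrog(x, v, accel, dt, n_steps):
    """Advance position x and velocity v with a kick-drift-kick Leapfrog.

    accel: function returning the acceleration at position x
    dt:    fixed time-step
    """
    a = accel(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a      # half kick
        x = x + dt * v_half            # full drift
        a = accel(x)                   # force at the new position
        v = v_half + 0.5 * dt * a      # second half kick
    return x, v
```

Applied to a harmonic oscillator, `leapfrog(1.0, 0.0, lambda x: -x, 0.01, 10000)` keeps the energy within a bounded error of order `dt**2` over many periods, with no secular drift, which is the practical benefit of a symplectic scheme. Only one force evaluation is needed per step.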

All the methods that we describe in what follows mainly differ in the calcula¬ 
tion of the force applied to each particle or, in other words, in how the Poisson 
equation is solved. 
2.9.2 Direct Summation

Also known as the Particle-Particle (PP) method (e.g., [1]), it consists in evaluating
the force on each particle by summing directly the influence exerted on
it by all neighbors. This method is robust but very CPU consuming: scaling
as $O(N^2)$, it allows only a small number of particles, typically $N \sim 10^3 - 10^5$.
It was revived recently by the development of special hardware dedicated to
the computation of the Newtonian force (e.g. [427]), mostly used for stellar
dynamics calculations (but see e.g. [243] for a cosmological application).
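The $O(N^2)$ cost is evident in a direct implementation, sketched below. The Plummer form of the softening, the units ($G = 1$ by default) and the function name are assumptions made for this illustration; the text does not specify a particular softening kernel, and periodic boundaries are ignored.

```python
def direct_accelerations(positions, masses, softening, G=1.0):
    """Acceleration on every particle by direct pairwise summation.

    positions: list of (x, y, z) tuples; masses: list of floats;
    softening: Plummer softening length (an assumed choice of kernel).
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [positions[j][c] - positions[i][c] for c in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + softening ** 2
            inv_r3 = r2 ** -1.5
            for c in range(3):
                acc[i][c] += G * masses[j] * d[c] * inv_r3
    return acc
```

The double loop visits every ordered pair, so the cost grows as $N^2$; exploiting the antisymmetry of the pair force halves the work but does not change the scaling.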


2.9.3 The Tree Algorithm 

The tree code is the most natural improvement of the PP method. It uses the
fact that the influence of remote structures on each particle can be computed
by performing a multipole expansion on clusters containing many particles.
With appropriate selection of the clusters, the expansion can be truncated at
low order. Therefore, the list of interactions on each particle is much shorter
than in the PP method, of order $\sim \log N$, resulting in an $O(N \log N)$ code.
The practical implementation of the tree-code consists in decomposing hierarchically
the system on a tree structure, which can be for example a mutually
nearest neighbor binary tree (e.g., [8]), or a space balanced Oct tree in which
each branch is a cubical portion of space (e.g., [22,309,89]). Then a criterion
is applied to see whether or not a given cluster of particles has to be broken
into smaller pieces (or equivalently, if it is necessary to walk down the tree).
Various schemes exist (e.g., [545]), the simplest one for the Oct tree [22] consisting
in subdividing the cells until the condition $s/r \le \theta$ is fulfilled, where $s$
is the size of the cell, $r$ is the distance of the cell center of mass to the particle
and $\theta$ is a tunable parameter of order unity.
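The role of the opening criterion can be illustrated with a toy configuration: two unit point masses separated by $s$, seen from a distance $r$ along the line joining them. This setup and the function names are assumptions chosen only to show how the error of the lowest-order (monopole) term scales with $s/r$; they are not taken from any particular tree code.

```python
def cell_is_acceptable(cell_size, distance, theta):
    """Opening criterion s / r <= theta: True if the cell may be used as a
    single multipole instead of being subdivided."""
    return cell_size / distance <= theta

def monopole_relative_error(s, r):
    """Relative force error when two unit masses at distances r - s/2 and
    r + s/2 along a line are replaced by a mass 2 at their center of mass."""
    exact = 1.0 / (r - 0.5 * s) ** 2 + 1.0 / (r + 0.5 * s) ** 2
    monopole = 2.0 / r ** 2
    return abs(exact - monopole) / exact
```

In this configuration the error scales as roughly $0.75\,(s/r)^2$ for small $s/r$: below one percent for $s/r = 0.1$, but of order unity for $s/r = 1$, which is why a smaller $\theta$ gives more accurate forces at the price of longer interaction lists.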

The tree data structure has many advantages: (i) the CPU time spent per time-step
does not depend significantly on the degree of clustering of the system;
(ii) implementation of individual time-steps per particle is fairly easy and this
can speed up the simulation significantly; (iii) the use of individual masses
per particle allows “zooming” in a particular region, for example a cluster,
a galaxy halo or a void: the location of interest is sampled accurately with
high resolution particles (with small mass), while tidal effects are modeled by
low resolution particles of mass increasing with distance from the high resolution
region; (iv) implementation on parallel architectures with distributed
memory is relatively straightforward (e.g., [546,193,601]). However, tree-codes
are rather demanding in memory (25-35 words per particle, e.g., [163]) and
accurate handling of periodic boundaries (e.g., [310]) is costly.

Typically, simulations using the tree-code can involve up to $\sim 10^7 - 10^8$ particles
if done on parallel supercomputers. They have high spatial resolution,
of order $\epsilon \sim \lambda/(10 - 20)$, where $\lambda$ is the mean inter-particle distance.

2.9.4 The PM Algorithm

In the Particle-Mesh (PM) method (e.g. see [314,191,454,86]), the mass of each
particle is interpolated on a fixed grid of size $N_g$ (with $N_g^3$ sites) to compute
the density. The Poisson equation is solved on the grid, generally by using
a Fast Fourier transform, then forces are interpolated back on the particles.
Implementing a PM code is thus rather simple, even on parallel architectures.
Scaling as $O(N, N_g^3 \log N_g)$, PM simulations generally have the advantage of being
low CPU consumers and of requiring a reasonable amount of memory. Thus, a
large number of particles can be used, $N \sim 10^7 - 10^9$, and typically $N_g = N^{1/3}$
or $2N^{1/3}$. The main advantage and weakness of the PM approach is its low
spatial resolution. Indeed, the softening parameter is fixed by the size of the
grid, $\epsilon \sim L/N_g$, where $L$ is the size of the box: a large softening length reduces
the effects of $N$-body relaxation and allows good phase-space sampling, but
considerably narrows the available dynamic range of scales. To achieve a spatial
resolution comparable to that of a tree-code while keeping the advantage of
the PM code, very large values of $N_g$ and $N$ would be needed, implying a
tremendous cost both in memory and in CPU.
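The three PM steps (mass assignment, Poisson solve in Fourier space, transform back) can be illustrated with a deliberately small one-dimensional sketch. The assumptions are loud ones: the grid is 1D and periodic, mass assignment is nearest-grid-point, the Poisson equation is written as $d^2\phi/dx^2 = \delta$ with all constant factors dropped, and a naive $O(N_g^2)$ discrete Fourier transform stands in for the FFT used by real codes. All function names are hypothetical.

```python
import cmath
import math

def dft(values, sign):
    """Naive O(N^2) discrete Fourier transform; a real PM code would use an FFT."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * j * m / n)
                for j, v in enumerate(values))
            for m in range(n)]

def ngp_density_contrast(positions, n_grid, box_size):
    """Nearest-grid-point mass assignment, returned as a density contrast."""
    counts = [0.0] * n_grid
    for x in positions:
        counts[int(round(x / box_size * n_grid)) % n_grid] += 1.0
    mean = len(positions) / n_grid
    return [c / mean - 1.0 for c in counts]

def solve_poisson_periodic(delta, box_size):
    """Solve d^2 phi / dx^2 = delta on a periodic grid: phi_k = -delta_k / k^2."""
    n = len(delta)
    delta_k = dft(delta, -1)
    phi_k = [0j] * n                      # the k = 0 mode is set to zero
    for m in range(1, n):
        m_signed = m if m <= n // 2 else m - n   # index -> signed wavenumber
        k = 2.0 * math.pi * m_signed / box_size
        phi_k[m] = -delta_k[m] / k**2
    return [(v / n).real for v in dft(phi_k, +1)]

# A single Fourier mode: the potential must be -delta / k^2 with k = 2 pi / L.
n_grid, box = 16, 1.0
delta = [math.cos(2.0 * math.pi * j / n_grid) for j in range(n_grid)]
phi = solve_poisson_periodic(delta, box)
```

The force on each particle would then follow from a finite difference of `phi` interpolated back to the particle positions; the resolution of that force is limited by the grid spacing $L/N_g$, as stated above.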

2.9.5 Hybrid Methods 

To increase spatial resolution of the PM approach, several improvements have 
been suggested. 

The most popular one is the P$^3$M code (PP+PM) where the PM force is
supplemented with a short-range contribution obtained by direct summation
of individual interactions between nearby particles (e.g., [314,199]). Implementation
of this code on a parallel supercomputer (T3E) produced a very
large cosmological simulation with $10^9$ particles in a “Hubble” volume of size
$L = 2000\,h^{-1}$ Mpc [420]. The main caveat of the P$^3$M approach is that as the
system evolves to a more clustered state, the time spent in calculation of PP
interactions becomes increasingly significant. To reduce the slowing-down due
to PP interactions, it was proposed to use a hierarchy of adaptive meshes in
regions of high particle density [162], giving birth to a very efficient $N$-body
code, the Adaptive P$^3$M (AP$^3$M).

Instead of direct PP summations to correct the PM force for short range
interactions, it is possible to use a tree algorithm in high density regions [695]
or in all PM cells [12], similarly to the P$^3$M code. Both these methods are
potentially faster than their P$^3$M competitor.

In the same spirit as in AP$^3$M, but without the PP part, another alternative
is to use Adaptive Mesh Refinement (AMR): the PM mesh is refined
locally when required with a hierarchy of nested rectangular sub-grids
(e.g., [675,6,341,264]). The forces can be computed at each level of the
hierarchy by a Fourier transform with appropriate boundary conditions. In
fact, the sub-grids need not be rectangular if one uses Oct tree structures,
which is theoretically even more efficient. In this adaptive refinement tree
(ART) method [386], the Poisson equation is solved by relaxation methods
(e.g., [314,532]).

Finally, it is worth mentioning a Lagrangian approach, which consists in using
a mesh with fixed size as in the PM code, but moving with the flow so that
resolution increases in high density regions and decreases elsewhere [269,516].
However, this potentially powerful method presents some difficulties, e.g. mesh
distortions may induce severe force anisotropies.

3 Random Cosmic Fields and their Statistical Description

In this chapter we succinctly recall current ideas about the physical origin 
of stochasticity in cosmic fields in different cosmological scenarios. We then 
present the statistical tools that are commonly used to describe random cosmic 
fields such as power spectra, probability distribution functions, moments and 
cumulants, and give some mathematical properties of interest. 

3.1 The Need for a Statistical Approach 

As we shall review in detail in the following chapters, the current explanation
of the large-scale structure of the universe is that the present distribution of
matter on cosmological scales results from the growth of primordial, small, seed
fluctuations on an otherwise homogeneous universe amplified by gravitational
instability. Tests of cosmological theories which characterize these primordial
seeds are not deterministic in nature but rather statistical, for the following
reasons. First, we do not have direct observational access to primordial fluctuations
(which would provide definite initial conditions for the deterministic
evolution equations). In addition, the time-scale for cosmological evolution is
so much longer than that over which we can make observations that it is not
possible to follow the evolution of single systems. In other words, what we
observe through our past light cone is different objects at different times
of their evolution; therefore testing the evolution of structure must be done
statistically.

The observable universe is thus modeled as a stochastic realization of a statistical
ensemble of possibilities. The goal is to make statistical predictions,
which in turn depend on the statistical properties of the primordial perturbations
leading to the formation of large-scale structures. Among the two classes
of models that have emerged to explain the large-scale structure of the universe,
the physical origin of stochasticity can be quite different and thus give
rise to very different predictions.

The most widely considered models, based on the inflationary paradigm [279],
generically give birth to adiabatic (footnote 10) Gaussian initial fluctuations, at least
in the simplest single-field models [602,304,280,20]. In this case the origin of
stochasticity lies in quantum fluctuations generated in the early universe; we
will consider this case in more detail below. However, one should keep in mind
that inflation is not necessarily the only mechanism that leads to Gaussian, or
almost Gaussian, initial conditions. For instance, topological defects based on
the non-linear $\sigma$-model in the large $N$-limit would also give Gaussian initial
conditions [655,333]. And in general the central limit theorem ensures that
such initial conditions are likely to happen in very broad classes of models.

10 As opposed to isocurvature fluctuations, which are a set of individual perturbations
such that the total fluctuation amplitude vanishes. In the adiabatic case, the total
amplitude does not vanish and this leads to perturbations in the spatial curvature.

The second class of models that have been developed for structure formation
is based on topological defects, of which cosmic strings have been studied in
most detail. In this case the origin of stochasticity lies in thermal fluctuations
of a field that undergoes a phase transition as the universe cools, and is likely
to obey non-Gaussian properties. Note however that these two classes of models
do not necessarily exclude each other. For instance, formation of cosmic
strings is encountered in specific models of inflation [65,66,352]. There are
also models inspired by duality properties of superstring theories, in which an
inflationary phase can be encountered but structure formation is caused by
the quantum fluctuations of the axion field (footnote 11) [668,159,111] rather than the
inflaton field. With such a mechanism the initial metric fluctuations will not
obey Gaussian statistics.


3.1.1 Physical Origin of Fluctuations from Inflation 

In models of inflation the stochastic properties of the fields originate from
quantum fluctuations of a scalar field, the inflaton. It is beyond the scope
of this review to describe inflationary models in any detail. We instead refer
the reader to recent reviews for a complete discussion [399,400,415]. It is
worth however recalling that in such models (at least for the simplest single-field
models within the slow-roll approximation) all fluctuations originate from
scalar adiabatic perturbations. During the inflationary phase the energy density
of the universe is dominated by the density stored in the inflaton field.
This field has quantum fluctuations that can be decomposed in Fourier modes
using the creation and annihilation operators $a^\dagger_{\mathbf k}$ and $a_{\mathbf k}$ for a wave mode $\mathbf k$,

$$\delta\varphi = \int d^3k \left[ a_{\mathbf k}\,\psi_k(t)\,\exp(i\mathbf k\cdot\mathbf x) + a^\dagger_{\mathbf k}\,\psi_k^*(t)\,\exp(-i\mathbf k\cdot\mathbf x) \right]. \tag{110}$$


The operators obey the standard commutation relation,

$$[a_{\mathbf k}, a^\dagger_{-\mathbf k'}] = \delta_D(\mathbf k + \mathbf k'), \tag{111}$$

and the mode functions $\psi_k(t)$ are obtained from the Klein-Gordon equation
for $\varphi$ in an expanding Universe. We give here its expression for a de-Sitter
metric (i.e. when the spatial sections are flat and $H$ is constant),

$$\psi_k(t) = \frac{H}{(2k)^{1/2}\,k} \left( i + \frac{k}{aH} \right) \exp\left[ \frac{i k}{aH} \right], \tag{112}$$

where $a$ and $H$ are respectively the expansion factor and the Hubble constant
that are determined by the overall content of the Universe through the
Friedmann equations, Eqs. (4-5).

11 However, this generally leads to isocurvature fluctuations rather than adiabatic.

When the modes exit the Hubble radius, $k/(aH) \ll 1$, one can see from
Eq. (112) that the dominant mode reads,

$$\varphi_{\mathbf k} \approx \frac{iH}{\sqrt{2}\,k^{3/2}} \left( a_{\mathbf k} + a^\dagger_{-\mathbf k} \right), \qquad \delta\varphi = \int d^3k\, \varphi_{\mathbf k}\, e^{i\mathbf k\cdot\mathbf x}. \tag{113}$$

Thus these modes are all proportional to $a_{\mathbf k} + a^\dagger_{-\mathbf k}$. One important consequence
of this is that the quantum nature of the fluctuations has disappeared
[281,375,376]: any combinations of $\varphi_{\mathbf k}$ commute with each other. The
field $\varphi$ can then be seen as a classical stochastic field where ensemble averages
identify with vacuum expectation values,

$$\langle \ldots \rangle \equiv \langle 0| \ldots |0\rangle. \tag{114}$$

After the inflationary phase the modes re-enter the Hubble radius. They leave
imprints of their energy fluctuations in the gravitational potential, the statistical
properties of which can therefore be deduced from Eqs. (111, 113).
All subsequent stochasticity that appears in the cosmic fields can thus be
expressed in terms of the random variable $\varphi_{\mathbf k}$.
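The super-horizon limit used above can be checked numerically. The sketch below simply evaluates the de-Sitter mode function of Eq. (112) and compares it with its dominant term for $k/(aH) \ll 1$; the function names and the numerical values of $k$, $a$ and $H$ are arbitrary choices made for illustration.

```python
import cmath
import math

def psi_k(k, a, H):
    """De-Sitter mode function of Eq. (112)."""
    x = k / (a * H)
    return H / (math.sqrt(2.0 * k) * k) * (1j + x) * cmath.exp(1j * x)

def superhorizon_limit(k, H):
    """Dominant term for k/(aH) << 1: i H / (sqrt(2) k^{3/2})."""
    return 1j * H / (math.sqrt(2.0) * k**1.5)

k, H = 1.0, 1.0
for a in (1.0, 1e2, 1e4):
    full = psi_k(k, a, H)
    limit = superhorizon_limit(k, H)
    # The relative deviation shrinks as the mode moves outside the Hubble radius.
    print(a, abs(full - limit) / abs(limit))
```

Since $(i + x)\,e^{ix} = i + O(x^2)$ for $x = k/(aH)$, the mode function freezes to a time-independent value once the mode is well outside the Hubble radius, with $|\psi_k|^2 = H^2 (1 + x^2)/(2k^3)$ exactly.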


3.1.2 Physical Origin of Fluctuations from Topological Defects

In models of structure formation with topological defects, stochasticity originates
from thermal fluctuations. One important difficulty in this case is that
topological defects generally behave as active seeds, and except in some special
cases (see for instance [194]), the dynamical evolution of these seeds is nonlinear
and nonlocal, hence requiring heavy numerical calculation for their description.
This is in particular true for cosmic strings that form a network whose
evolution is extremely complex (see for instance [90]). Therefore in this case it
is not possible to write down in general how the stochasticity in cosmic fields
relates to more fundamental processes. See [674] for a review of the physics
of topological defects. Current observations of multiple acoustic peaks in the
power spectrum of microwave background anisotropies severely constrain significant
contributions to perturbations from active seeds [476,282,397].


3.2 Correlation Functions and Power Spectra 

From now on, we consider a cosmic scalar field whose statistical properties we
want to describe. This field can either be the cosmic density field, $\delta(\mathbf x)$, the
cosmic gravitational potential, the velocity divergence field, or any other field
of interest.

3.2.1 Statistical Homogeneity and Isotropy

A random field is called statistically homogeneous (footnote 12) if all the joint multi-point
probability distribution functions $p(\delta_1, \delta_2, \ldots)$ or its moments, ensemble
averages of local density products, remain the same under translation of the
coordinates $\mathbf x_1, \mathbf x_2, \ldots$ in space (here $\delta_i \equiv \delta(\mathbf x_i)$). Thus the probabilities depend
only on the relative positions. A stochastic field is called statistically
isotropic if $p(\delta_1, \delta_2, \ldots)$ is invariant under spatial rotations. We will assume
that cosmic fields are statistically homogeneous and isotropic, as predicted by
most cosmological theories. The validity of this assumption can and should
be tested against the observational data. Examples of primordial fields which
do not obey statistical homogeneity and isotropy are fluctuations in compact
hyperbolic spaces (see e.g. [82]). Furthermore, redshift distortions in galaxy
redshift surveys introduce significant deviations from statistical isotropy and
homogeneity in the redshift-space density field, as will be reviewed in Chapter 7.

3.2.2 Two-Point Correlation Function and Power Spectrum 

The two-point correlation function is defined as the joint ensemble average of
the density at two different locations,

$$\xi(r) = \langle \delta(\mathbf x)\,\delta(\mathbf x + \mathbf r) \rangle, \tag{115}$$

which depends only on the norm of $\mathbf r$ due to statistical homogeneity and
isotropy. The density contrast $\delta(\mathbf x)$ is usually written in terms of its Fourier
components,

$$\delta(\mathbf x) = \int d^3k\, \delta(\mathbf k) \exp(i\mathbf k\cdot\mathbf x). \tag{116}$$

The quantities $\delta(\mathbf k)$ are then complex random variables. As $\delta(\mathbf x)$ is real, it
follows that

$$\delta(\mathbf k) = \delta^*(-\mathbf k). \tag{117}$$

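The density field is therefore determined entirely by the statistical properties
of the random variable $\delta(\mathbf k)$. We can compute the correlators in Fourier space,

$$\langle \delta(\mathbf k)\,\delta(\mathbf k') \rangle = \int \frac{d^3x}{(2\pi)^3} \frac{d^3r}{(2\pi)^3}\, \langle \delta(\mathbf x)\,\delta(\mathbf x + \mathbf r) \rangle \exp[-i(\mathbf k + \mathbf k')\cdot\mathbf x - i\mathbf k'\cdot\mathbf r], \tag{118}$$

which gives,

$$\langle \delta(\mathbf k)\,\delta(\mathbf k') \rangle = \int \frac{d^3x}{(2\pi)^3} \frac{d^3r}{(2\pi)^3}\, \xi(r) \exp[-i(\mathbf k + \mathbf k')\cdot\mathbf x - i\mathbf k'\cdot\mathbf r]$$
$$= \delta_D(\mathbf k + \mathbf k') \int \frac{d^3r}{(2\pi)^3}\, \xi(r) \exp(i\mathbf k\cdot\mathbf r)$$
$$\equiv \delta_D(\mathbf k + \mathbf k')\, P(k), \tag{119}$$

where $P(k)$ is by definition the density power spectrum. The inverse relation
between two-point correlation function and power spectrum thus reads

$$\xi(r) = \int d^3k\, P(k) \exp(i\mathbf k\cdot\mathbf r). \tag{120}$$

12 This is in contrast with a homogeneous field, which takes the same value everywhere
in space.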

There are basically two conventions in the literature regarding the definition
of the power spectrum, which differ by a factor of $(2\pi)^3$. In this review we
use the convention in Eqs. (36), (116) and (119) which lead to Eq. (120).
Another popular choice is to reverse the role of $(2\pi)^3$ factors in the Fourier
transforms, i.e. $\delta(\mathbf k) \equiv \int d^3r \exp(-i\mathbf k\cdot\mathbf r)\,\delta(\mathbf r)$, and then modify Eq. (119) to
read $\langle \delta(\mathbf k)\,\delta(\mathbf k') \rangle = (2\pi)^3\, \delta_D(\mathbf k + \mathbf k')\, P(k)$, which leads to $k^3 P(k)/(2\pi^2)$ being
the contribution per logarithmic wavenumber to the variance, rather than
$4\pi k^3 P(k)$ as in our case.
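Because the two conventions are easy to confuse in practice, a small conversion helper can serve as a sanity check. The sketch below assumes only the relations stated in the paragraph above; the function names are hypothetical, and the numerical values of $k$ and $P$ are arbitrary.

```python
import math

def variance_per_lnk_review(k, P):
    """Contribution per ln k to the variance in the convention of Eq. (119): 4 pi k^3 P(k)."""
    return 4.0 * math.pi * k**3 * P

def variance_per_lnk_alt(k, P_alt):
    """Same quantity in the alternative convention: k^3 P(k) / (2 pi^2)."""
    return k**3 * P_alt / (2.0 * math.pi**2)

def to_alt_convention(P):
    """The two power spectra differ by a factor (2 pi)^3."""
    return (2.0 * math.pi)**3 * P

k, P = 0.1, 2.5e-2
a = variance_per_lnk_review(k, P)
b = variance_per_lnk_alt(k, to_alt_convention(P))
print(a, b)  # both conventions give the same variance per logarithmic interval
```

The dimensionless quantity is convention independent, which makes it the safer quantity to compare across papers.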

3.2.3 The Wick Theorem for Gaussian Fields 

The power spectrum is a well defined quantity for almost all homogeneous
random fields. This concept becomes however extremely fruitful when one
considers a Gaussian field. It means that any joint distribution of local densities
is Gaussian distributed. Any ensemble average of product of variables can
then be obtained by product of ensemble averages of pairs. We write explicitly
this property for the Fourier modes as it will be used extensively in this work,

$$\langle \delta(\mathbf k_1) \ldots \delta(\mathbf k_{2p+1}) \rangle = 0, \tag{121}$$

$$\langle \delta(\mathbf k_1) \ldots \delta(\mathbf k_{2p}) \rangle = \sum_{\rm all\ pair\ associations}\ \prod_{p\ {\rm pairs}\ (i,j)} \langle \delta(\mathbf k_i)\,\delta(\mathbf k_j) \rangle. \tag{122}$$

This is the Wick theorem, a fundamental theorem for classical and quantum
field theories.
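The combinatorics of Eq. (122) can be made concrete with a short enumeration of pair associations. The sketch below is an illustration under an explicit simplification: the Fourier modes are replaced by a finite set of zero-mean Gaussian variables with a given covariance matrix `cov`, which plays the role of $\langle \delta(\mathbf k_i)\,\delta(\mathbf k_j) \rangle$. The function names are hypothetical.

```python
def pair_associations(indices):
    """Yield every way of grouping `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairing in pair_associations(remaining):
            yield [(first, partner)] + pairing

def gaussian_moment(indices, cov):
    """Wick sum for <x_{i1} ... x_{in}> of zero-mean Gaussian variables."""
    if len(indices) % 2 == 1:
        return 0.0  # odd moments vanish, cf. Eq. (121)
    total = 0.0
    for pairing in pair_associations(list(indices)):
        term = 1.0
        for i, j in pairing:
            term *= cov[i][j]
        total += term
    return total

cov = [[1.0, 0.5], [0.5, 1.0]]
n_pairings_4 = len(list(pair_associations([0, 1, 2, 3])))
n_pairings_6 = len(list(pair_associations([0, 1, 2, 3, 4, 5])))
fourth_moment = gaussian_moment([0, 0, 1, 1], cov)
print(n_pairings_4, n_pairings_6, fourth_moment)
```

The number of pair associations of $2p$ variables is $(2p-1)!!$, i.e. 3 for four variables and 15 for six, which is the number of terms in the sum of Eq. (122).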

The statistical properties of the random variables $\delta(\mathbf k)$ are then entirely determined
by the shape and normalization of $P(k)$. A specific cosmological
model will eventually be determined e.g. by the power spectrum in the linear
regime, and by $\Omega_m$ and $\Omega_\Lambda$ only, as long as one is only interested in the dark matter
behavior (footnote 13).

13 Note that there are now emerging models with a non-standard vacuum equation
of state, the so-called quintessence models [536,707], in which the vacuum energy is
that of a non-static scalar field. In this case the detailed behavior of the large-scale
structure growth will depend on the dynamical evolution of the quintessence field.

Fig. 1. Representation of the connected part of the moments.

Fig. 2. Writing of the three-point moment in terms of connected parts.

As mentioned in the previous section, in the case of an inflationary scenario the
initial energy fluctuations are expected to be distributed as a Gaussian random
field [602,304,280,20]. This is a consequence of the commutation rules given
by Eq. (111) for the creation and annihilation operators for a free quantum
field. They imply that





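$$\langle 0|\, (a_{\mathbf k} + a^\dagger_{-\mathbf k})(a_{\mathbf k'} + a^\dagger_{-\mathbf k'})\, |0\rangle = \delta_D(\mathbf k + \mathbf k'). \tag{123}$$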


As a consequence of this, the relations in Eqs. (121-122) are verified for $\varphi_{\mathbf k}$
for all modes that exit the Hubble radius, which long afterwards come back
in as classical stochastic perturbations. These properties obviously apply also
to any quantities linearly related to $\varphi_{\mathbf k}$.


3.2.4 Higher-Order Correlators: Diagrammatics 

In general it is possible to define higher-order correlation functions. They are
defined as the connected part (denoted with subscript $c$) of the joint ensemble
average of the density in an arbitrary number of locations. They can be
formally written,

$$\xi_N(\mathbf x_1, \ldots, \mathbf x_N) = \langle \delta(\mathbf x_1), \ldots, \delta(\mathbf x_N) \rangle_c \tag{124}$$
$$\equiv \langle \delta(\mathbf x_1), \ldots, \delta(\mathbf x_N) \rangle - \sum_{S \in P(\{\mathbf x_1, \ldots, \mathbf x_N\})}\ \prod_{s_i \in S} \xi_{\# s_i}(\mathbf x_{s_i(1)}, \ldots, \mathbf x_{s_i(\# s_i)}), \tag{125}$$

where the sum is made over the proper partitions (any partition except the
set itself) of $\{\mathbf x_1, \ldots, \mathbf x_N\}$ and $s_i$ is thus a subset of $\{\mathbf x_1, \ldots, \mathbf x_N\}$ contained in
partition $S$. When the average of $\delta(\mathbf x)$ is defined as zero, only partitions that
contain no singlets contribute.

The decomposition in connected and non-connected parts can be easily visualized.
It means that any ensemble average can be decomposed into a sum of products
of connected parts. They are defined for instance in Fig. 1. The three-point
moment is “written” in Fig. 2 and the four-point moment in Fig. 3.

Fig. 4. Disconnected and connected part of the two-point function of the field $\delta$
assuming it is given by $\delta = \phi^2$ with $\phi$ Gaussian.


In the case of a Gaussian field all connected correlation functions are zero except
$\xi_2$. This is a consequence of Wick’s theorem. As a result the only non-zero connected
part is the two-point correlation function. An important consequence
is that the statistical properties of any field, not necessarily linear, built from
a Gaussian field $\delta$ can be written in terms of combinations of two-point functions
of $\delta$. Note that in a diagrammatic representation the connected moments
of any such field are represented by connected graphs. This is illustrated in
Fig. 4 for the field $\delta = \phi^2$: the connected part of the 2-point function of this
field is obtained by all the diagrams that explicitly join the two points. The
other ones contribute to the moments, but not to its connected part.

The connected part has the important property that it vanishes when one
or more points are infinitely separated from the others. In addition, it provides
a useful way of characterizing the statistical properties, since unlike unconnected
correlation functions, each connected correlation provides independent
information.

These definitions can be extended to Fourier space. Because of homogeneity
of space $\langle \delta(\mathbf k_1) \ldots \delta(\mathbf k_N) \rangle_c$ is always proportional to $\delta_D(\mathbf k_1 + \ldots + \mathbf k_N)$. Then
we can define $P_N(\mathbf k_1, \ldots, \mathbf k_N)$ with

$$\langle \delta(\mathbf k_1) \ldots \delta(\mathbf k_N) \rangle_c = \delta_D(\mathbf k_1 + \ldots + \mathbf k_N)\, P_N(\mathbf k_1, \ldots, \mathbf k_N). \tag{126}$$

One particular case that will be discussed in the following is for $N = 3$, the
bispectrum, which is usually denoted by $B(\mathbf k_1, \mathbf k_2, \mathbf k_3)$.

3.2.5 Probabilities and Correlation Functions

Correlation functions are directly related to the multi-point probability functions;
in fact they can be defined from them. Here we illustrate this for the case
of the density field, as these results are frequently used in the literature. The
physical interpretation of the two-point correlation function is that it measures
the excess over random probability that two particles at volume elements $dV_1$
and $dV_2$ are separated by distance $x_{12} \equiv |\mathbf x_1 - \mathbf x_2|$,

$$dP_{12} = n^2 [1 + \xi(x_{12})]\, dV_1 dV_2, \tag{127}$$

where $n$ is the mean density. If there is no clustering (random distribution),
$\xi = 0$ and the probability of having a pair of particles is just given by the mean
density squared, independently of distance. Since the probability of having a
particle in $dV_1$ is $n\,dV_1$, the conditional probability that there is a particle at
$dV_2$ given that there is one at $dV_1$ is

$$dP(2|1) = n [1 + \xi(x_{12})]\, dV_2. \tag{128}$$

The nature of clustering is clear from this expression; if objects are clustered
($\xi(x_{12}) > 0$), then the conditional probability is enhanced, whereas if objects
are anticorrelated ($\xi(x_{12}) < 0$) the conditional probability is suppressed over
the random distribution case, as expected. Similarly to Eq. (127), for the
three-point case the probability of having three objects is given by

$$dP_{123} = n^3 [1 + \xi(x_{12}) + \xi(x_{23}) + \xi(x_{31}) + \xi_3(x_{12}, x_{23}, x_{31})]\, dV_1 dV_2 dV_3, \tag{129}$$

where $\xi_3$ denotes the three-point (connected) correlation function. If the density
field were Gaussian, $\xi_3 = 0$ and all probabilities would be determined by $\xi(r)$
alone. Analogous results hold for higher-order correlations (e.g. see [508]).


3.3 Moments, Cumulants and their Generating Functions 
3.3.1 Moments and Cumulants 

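One particular case for Eq. (125) is when all points are at the same location.
Because of statistical homogeneity $\xi_p(\mathbf x, \ldots, \mathbf x)$ is independent of the position
$\mathbf x$ and it reduces to the cumulants of the one-point density probability distribution
function, $\langle \delta^p \rangle_c$. The relation (125) tells us also how the cumulants
are related to the moments $\langle \delta^p \rangle$. For convenience we write here the first few
terms,

$$\langle \delta \rangle_c = \langle \delta \rangle$$
$$\langle \delta^2 \rangle_c = \sigma^2 = \langle \delta^2 \rangle - \langle \delta \rangle_c^2$$
$$\langle \delta^3 \rangle_c = \langle \delta^3 \rangle - 3 \langle \delta^2 \rangle_c \langle \delta \rangle_c - \langle \delta \rangle_c^3$$
$$\langle \delta^4 \rangle_c = \langle \delta^4 \rangle - 4 \langle \delta^3 \rangle_c \langle \delta \rangle_c - 3 \langle \delta^2 \rangle_c^2 - 6 \langle \delta^2 \rangle_c \langle \delta \rangle_c^2 - \langle \delta \rangle_c^4$$
$$\langle \delta^5 \rangle_c = \langle \delta^5 \rangle - 5 \langle \delta^4 \rangle_c \langle \delta \rangle_c - 10 \langle \delta^3 \rangle_c \langle \delta^2 \rangle_c - 10 \langle \delta^3 \rangle_c \langle \delta \rangle_c^2 - 15 \langle \delta^2 \rangle_c^2 \langle \delta \rangle_c - 10 \langle \delta^2 \rangle_c \langle \delta \rangle_c^3 - \langle \delta \rangle_c^5 \tag{130}$$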

In most cases $\langle \delta \rangle = 0$ and the above equations simplify considerably. In the
following we usually denote by $\sigma^2$ the local second order cumulant. The Wick
theorem then implies that in the case of a Gaussian field $\sigma^2$ is the only non-vanishing
cumulant.

It is important to note that the local PDF is essentially characterized by its
cumulants which constitute a set of independent quantities. This is important
since in most of the applications that follow the higher-order cumulants are small
compared to their associated moments. Finally, let us note that a useful mathematical
property of cumulants is that $\langle (b\delta)^n \rangle_c = b^n \langle \delta^n \rangle_c$, and $\langle (b + \delta)^n \rangle_c = \langle \delta^n \rangle_c$
for $n \geq 2$, where $b$ is an ordinary number.
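The relations of Eq. (130) are the first instances of a general recursion between moments and cumulants, $\langle \delta^n \rangle = \sum_{j=1}^{n} \binom{n-1}{j-1} \langle \delta^j \rangle_c \langle \delta^{n-j} \rangle$, which is a standard identity and not specific to this review. The sketch below uses it to convert a list of moments into cumulants; the function name is hypothetical.

```python
from math import comb

def cumulants_from_moments(moments):
    """moments[p-1] = <delta^p>; returns [<delta^p>_c for p = 1..len(moments)]."""
    m = [1.0] + list(moments)            # m[0] = <delta^0> = 1
    kappa = [0.0] * len(m)
    for n in range(1, len(m)):
        kappa[n] = m[n] - sum(comb(n - 1, j - 1) * kappa[j] * m[n - j]
                              for j in range(1, n))
    return kappa[1:]

# Zero-mean Gaussian with sigma^2 = 2: moments 0, 2, 0, 3*sigma^4 = 12, 0.
gaussian = cumulants_from_moments([0.0, 2.0, 0.0, 12.0, 0.0])
# Gaussian with mean 1 and unit variance: moments 1, 2, 4, 10.
shifted = cumulants_from_moments([1.0, 2.0, 4.0, 10.0])
print(gaussian, shifted)
```

In both cases only the first two cumulants survive, in line with the statement that $\sigma^2$ is the only non-vanishing cumulant of a zero-mean Gaussian field, and with the invariance of the cumulants of order $n \geq 2$ under a constant shift.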

3.3.2 Smoothing 

The density distribution is usually smoothed with a filter $W_R$ of a given size,
$R$, commonly a top-hat or a Gaussian window. Indeed, this is required by the
discrete nature of galaxy catalogs and $N$-body experiments used to simulate
them. Moreover, we shall see later that the scale-free nature of gravitational
clustering implies some remarkable properties about the scaling behavior of
the smoothed density distribution. The quantities of interest are then the
moments $\langle \delta_R^p \rangle$ and the cumulants $\langle \delta_R^p \rangle_c$ of the smoothed density field

$$\delta_R(\mathbf x) = \int W_R(\mathbf x' - \mathbf x)\, \delta(\mathbf x')\, d^3x'. \tag{131}$$

Note that for the top hat window,

$$\bar\xi_p \equiv \langle \delta_R^p \rangle_c = \int \xi_p(\mathbf x_1, \ldots, \mathbf x_p)\, \frac{d^D x_1 \ldots d^D x_p}{v_R^p} \tag{132}$$

(where $D = 2$ or 3 is the dimension of the field) is nothing but the average of
the $p$-point correlation function over the corresponding cell of volume $v_R$.
For a smooth field, equations in Sect. 3.3.1 are valid for $\delta$ as well as $\delta_R$. Some
corrections are required if $\delta$ is a sum of Dirac delta functions as in real galaxy
catalogs. We shall come back to this in Chapter 6.

In the remainder of this chapter, we shall omit the subscript $R$ which stands
for smoothing, but it will be implicitly assumed.

3.3.3 Generating Functions

It is convenient to define a function from which all moments can be generated,
namely the moment generating function defined by

$$M(t) \equiv \sum_{p=0}^{\infty} \frac{\langle \delta^p \rangle}{p!}\, t^p = \int_{-\infty}^{+\infty} p(\delta)\, e^{t\delta}\, d\delta = \langle \exp(t\delta) \rangle. \tag{133}$$

The moments can obviously be obtained by subsequent derivatives of this function
at the origin $t = 0$. A cumulant generating function can similarly be
defined by

$$C(t) \equiv \sum_{p=2}^{\infty} \frac{\langle \delta^p \rangle_c}{p!}\, t^p. \tag{134}$$

A fundamental result is that the cumulant generating function is given by the
logarithm of the moment generating function (see e.g. appendix D in [67] for
a proof)

$$M(t) = \exp[C(t)]. \tag{135}$$

In the case of a Gaussian PDF, this is straightforward to check since $\langle \exp(t\delta) \rangle =
\exp(\sigma^2 t^2/2)$.


3.4 Probability Distribution Functions

The probability distribution function (PDF) of the local density can be obtained
from the cumulant generating function by inverting Eq. (133) (footnote 14). This
inverse relation involves the inverse Laplace transform, and can formally be
written in terms of an integral in the complex plane (see [16] and Appendix E
for a detailed account of this relation),

$$P(\delta) = \int_{-i\infty}^{i\infty} \frac{dt}{2\pi i} \exp[t\delta + C(-t)]. \tag{136}$$

For a Gaussian distribution the change of variable $t \to it$ gives the familiar
Gaussian integral.

14 However, it may happen that the moment or cumulant generating function is
not defined because of the lack of convergence of the series in Eq. (133). In this
case the PDF is not uniquely defined by its moments. In particular, this is the case
for the log-normal distribution. There are indeed other PDF’s that have the same
moments [312].

This can be easily generalized to multidimensional PDF’s. We then have

$$P(\delta_1, \ldots, \delta_p) = \int_{-i\infty}^{i\infty} \frac{dt_1}{2\pi i} \cdots \int_{-i\infty}^{i\infty} \frac{dt_p}{2\pi i} \exp\left[ \sum_{i=1}^{p} t_i \delta_i + C(-t_1, \ldots, -t_p) \right], \tag{137}$$

with

$$C(t_1, \ldots, t_p) = \sum_{q_1, \ldots, q_p} \langle \delta_1^{q_1} \cdots \delta_p^{q_p} \rangle_c\, \frac{t_1^{q_1} \cdots t_p^{q_p}}{q_1! \cdots q_p!}. \tag{138}$$


3.5 Weakly Non-Gaussian Distributions: Edgeworth Expansion 


Throughout this review we will often be dealing with fields that depart only
weakly from a Gaussian distribution. To be more specific, they depart in such
a way that

$$\langle \delta^p \rangle_c \sim \sigma^{2p-2} \tag{139}$$

when $\sigma$ is small (footnote 15). It is then natural to define the coefficient $S_p$ as

$$S_p \equiv \frac{\langle \delta^p \rangle_c}{\sigma^{2p-2}} \tag{140}$$

(similar definitions will be introduced subsequently for the other fields). Introducing
the $S_p$ generating function (sometimes also called the cumulant
generating function) with,

$$\varphi(y) \equiv \sum_{p=2}^{\infty} S_p\, \frac{(-1)^{p-1}}{p!}\, y^p = -\sigma^2\, C(-y/\sigma^2), \tag{141}$$

we get from Eq. (136),

$$P(\delta)\, d\delta = \frac{d\delta}{2\pi i\, \sigma^2} \int_{-i\infty}^{+i\infty} dy\, \exp\left[ -\frac{\varphi(y)}{\sigma^2} + \frac{\delta y}{\sigma^2} \right]. \tag{142}$$

Then a number of approximations and truncations can be applied to this
expression to decompose the local PDF. This leads to the Edgeworth form
of the Gram-Charlier series [609] applied to statistics of weakly nonlinear
fields. This expansion was derived initially in [405,406] and later proposed in
cosmological contexts [552,49,356].

15 This is a consequence of Gaussian initial conditions and the fact that non-linearities
in the equations of motion are quadratic, see Chapter 4.

The Edgeworth expansion can be derived from Eq. (142) of the density PDF
assuming that the density contrast $\delta$ is of the order of $\sigma$ and small. The relevant
values of $y$ are then also of the order of $\sigma$ and are thus expected to be small.
It is then legitimate to expand the function $\varphi(y)$

$$\varphi(y) \approx -\frac{1}{2} y^2 + \frac{S_3}{3!} y^3 - \frac{S_4}{4!} y^4 + \frac{S_5}{5!} y^5 \pm \ldots. \tag{143}$$

To calculate the density PDF, we substitute the expansion (143) into the
integral in Eq. (142). Then we make a further expansion of the non-Gaussian
part of the factor $\exp[-\varphi(y)/\sigma^2]$ with respect to both $y$ and $\sigma$ assuming they
are of the same order.

Finally, collecting the terms of the same order in $\sigma$ we obtain the so-called
Edgeworth form of the Gram-Charlier series for density PDF,

$$P(\delta)\, d\delta = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp(-\nu^2/2) \Big[ 1 + \sigma \frac{S_3}{6} H_3(\nu) + \sigma^2 \Big( \frac{S_4}{24} H_4(\nu) + \frac{S_3^2}{72} H_6(\nu) \Big)$$
$$+ \sigma^3 \Big( \frac{S_5}{120} H_5(\nu) + \frac{S_4 S_3}{144} H_7(\nu) + \frac{S_3^3}{1296} H_9(\nu) \Big) + \ldots \Big]\, d\delta, \tag{144}$$

where $\nu \equiv \delta/\sigma$ and $H_n(\nu)$ are the Hermite polynomials

$$H_n(\nu) \equiv (-1)^n \exp(\nu^2/2)\, \frac{d^n}{d\nu^n} \exp(-\nu^2/2)$$
$$= \nu^n - \frac{n(n-1)}{1!} \frac{\nu^{n-2}}{2} + \frac{n(n-1)(n-2)(n-3)}{2!} \frac{\nu^{n-4}}{2^2} - \ldots, \tag{145}$$

thus

$$H_3(\nu) = \nu^3 - 3\nu, \tag{146}$$
$$H_4(\nu) = \nu^4 - 6\nu^2 + 3, \tag{147}$$
$$H_5(\nu) = \nu^5 - 10\nu^3 + 15\nu. \tag{148}$$


This is a universal form for any slightly non-Gaussian field, i.e. when $\sigma$ is
small and $S_p$ are finite. Note that the parameters $S_p$ might vary weakly with
$\sigma$, affecting the expansion (144) beyond the third-order term (see [49]).

With such an approach, it is possible to get an approximate form of the density
PDF from a few known low-order cumulants. This method is irreplaceable
when only a few cumulants have been derived from first principles. However,
it is important to note that this expansion is valid only in the slightly non-Gaussian
regime. The validity domain of the form (144) is limited to finite
values of $\delta/\sigma$, typically $\delta/\sigma \lesssim 0.5$.
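Equation (144) is straightforward to evaluate numerically once $S_3$ and $S_4$ are given. The sketch below truncates the series after the $\sigma^2$ term and builds the Hermite polynomials from the standard recurrence $H_{n+1}(\nu) = \nu H_n(\nu) - n H_{n-1}(\nu)$. The function names and the values chosen for $\sigma$ and $S_3$ are illustrative assumptions only, not predictions of perturbation theory.

```python
import math

def hermite(n, nu):
    """Hermite polynomial H_n(nu) of Eq. (145), via H_{n+1} = nu H_n - n H_{n-1}."""
    h_prev, h = 1.0, nu
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, nu * h - m * h_prev
    return h

def edgeworth_pdf(delta, sigma, S3=0.0, S4=0.0):
    """Edgeworth form of Eq. (144), truncated after the sigma^2 term."""
    nu = delta / sigma
    gauss = math.exp(-0.5 * nu**2) / math.sqrt(2.0 * math.pi * sigma**2)
    corr = (1.0
            + sigma * S3 / 6.0 * hermite(3, nu)
            + sigma**2 * (S4 / 24.0 * hermite(4, nu)
                          + S3**2 / 72.0 * hermite(6, nu)))
    return gauss * corr

# Normalization is preserved: each Hermite term integrates to zero against the Gaussian.
step = 0.001
total = sum(edgeworth_pdf(-3.0 + step * i, 0.3, S3=3.0) for i in range(6001)) * step

# Far in the tail the truncated series can turn negative.
tail_value = edgeworth_pdf(-1.5, 0.5, S3=3.0)
print(total, tail_value)
```

With these illustrative values the truncated expression remains normalized to unity but is negative at $\nu = -3$, which shows concretely why the expansion should only be trusted for moderate values of $\delta/\sigma$.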

A well-known problem with the Edgeworth expansion is that it does not give
a positive definite PDF; in particular this manifests itself in the tails of the
distribution. To improve this behavior, an Edgeworth-like expansion about
the Gamma PDF (which has exponential tails) has been explored in [258].
To bypass the positivity problem, it was proposed to apply the Edgeworth
expansion to the logarithm of the density instead of the density itself [148].
With this change of variable, motivated by dynamics [136], the approximation
works well even into the nonlinear regime for $\sigma^2 \lesssim 10$ [148,656].

Extensions of Eq. (144) have been written for joint PDF’s [406,409]. Note that
it can be done only when the cross-correlation matrix between the variables
is regular (see [56] for details).

4 From Dynamics to Statistics: N-Point Results


A general approach to go from dynamics to statistics would be to solve the
Vlasov equation from initial conditions for the phase-space density function
$f(\mathbf x, \mathbf p)$ given by a stochastic process such as inflation. Correlation functions
in configuration space reviewed in Chapter 3 can be trivially extended to
phase-space, and the Vlasov equation yields equations of motion for these
phase-space correlation functions. The result is a set of coupled non-linear
integro-differential equations, the so-called BBGKY hierarchy (footnote 16), in which the
one-point density is related to the two-point phase-space correlation function,
the two-point depends on the three-point, and so forth. However, as mentioned
in Chapter 2, if we restrict ourselves to the single stream regime, the study of the
Vlasov equation reduces to studying the evolution of the density and velocity
fields given by the continuity, Euler and Poisson equations. Therefore, all we
have to consider in this case is the correlation functions of density and velocity
fields.

In this chapter, we review how the results discussed in Chapter 2 about the 
time evolution of density and velocity fields can be used to understand the 
evolution of their statistical properties, characterized by correlation functions 
as summarized in the previous chapter. Most of the calculations will be done 
assuming Gaussian initial conditions; in this case the main focus is on a quantitative 
understanding of the emergence of non-Gaussianity due to non-linear 
evolution. In Sect. 4.4 we discuss results derived from non-Gaussian initial 
conditions. In Chapter 5 we present, with similar structure, analogous results 
for one-point statistics, with emphasis on the evolution of local moments and 
PDF’s. 

4.1 The Weakly Non-Linear Regime: “Tree-Level” PT 

4.1.1 Emergence of Non-Gaussianity 

If the cosmic fields are Gaussian, their power spectrum $P(k,\tau)$, 

$$\langle \delta(\mathbf{k},\tau)\,\delta(\mathbf{k}',\tau) \rangle_c = \delta_D(\mathbf{k}+\mathbf{k}')\,P(k,\tau),$$   (149) 

(or, equivalently, their two-point correlation function) completely describes 
the statistical properties. However, as we saw in Chapter 2, the dynamics 
of gravitational instability is non-linear, and therefore non-linear evolution 
inevitably leads to the development of non-Gaussian features. 

16 After N. N. Bogoliubov, M. Born, H. S. Green, J. G. Kirkwood and J. Yvon, who 
independently obtained the set of equations between 1935 and 1962. Rigorously, this 
route from the Vlasov equation to the BBGKY equations is restricted to the so-called 
“fluid limit” in which the number of particles is effectively infinite and there 
are no relaxation effects. 

Fig. 5. Tree diagrams for the three-point function or bispectrum, $\langle \delta(1)\delta(2)\delta(3) \rangle_c$. 

Fig. 6. Tree diagrams for the four-point function or trispectrum, $\langle \delta(1)\delta(2)\delta(3)\delta(4) \rangle_c$. 

The statistical characterization of non-Gaussian fields is, in general, a non-trivial 
subject. As we discussed in the previous chapter, the problem is that in 
principle all N-point correlation functions are needed to specify the statistical 
properties of cosmic fields. In fact, for general non-Gaussian fields, it is not 
clear that correlation functions (either in real or Fourier space) are the best 
set of quantities that describes the statistics in the most useful way. 

The situation is somewhat different for gravitational clustering from Gaussian 
initial conditions. Here it is possible to calculate in a model-independent way 
precisely how the non-Gaussian features arise, and what is the most natural 
statistical description. In particular, since the non-linearities in the equations 
of motion are quadratic, gravitational instability generates connected higher-order 
correlation functions that scale as $\xi_N \propto \xi_2^{N-1}$ at large scales, where 
$\xi_2 \ll 1$ and PT applies [232]. This scaling can be naturally represented by 
connected tree diagrams, where each link represents the two-point function 
(or power spectrum in Fourier space), since for $N$ points $(N-1)$ links are 
necessary to connect them in a tree-like fashion. 

As a consequence of this scaling, the so-called hierarchical amplitudes $Q_N$ 
defined by 

$$Q_N \equiv \frac{\xi_N}{\sum_{\rm labelings} \prod_{{\rm edges}\ ij}^{N-1} \xi_2(r_{ij})},$$   (150) 

where the denominator is given by all the topologically distinct tree diagrams 
(the different $N^{N-2}$ ways of drawing $N-1$ links that connect $N$ points), 
are a very useful set of statistical quantities to describe the properties of 
cosmic fields. In particular, they are independent of the amplitude of the two-point 
function, and for scale-free initial conditions they are independent of 
overall scale. As we shall see, the usefulness of these statistics is not just 
restricted to the weakly non-linear regime (large scales); in fact, there are 
reasons to expect that in the opposite regime, at small scales where $\xi_2 \gg 1$, 
the scaling $\xi_N \propto \xi_2^{N-1}$ is recovered. In this sense, the hierarchical amplitudes 
$Q_N$ (and their one-point cousins, the $S_p$ parameters) are the most natural set 
of statistics to describe the non-Gaussianity that results from gravitational 
clustering. 

Figures 5 and 6 show the tree diagrams that describe the three- and four-point 
function induced by gravity. As we already said, $N-1$ links (representing $\xi_2$) 
are needed to describe the connected $N$-point function, and furthermore, the 
number of lines coming out of a given vertex is the order in PT that gives 
rise to such a diagram. For example, the diagram in Fig. 5 requires linear and 
second-order PT, representing $\langle \delta_2(1)\delta_1(2)\delta_1(3) \rangle_c$ (as in Chapter 2, subscripts 
describe the order in PT). On the other hand, the diagrams in Fig. 6 require up 
to third order in PT. The first term represents $\langle \delta_1(1)\delta_2(2)\delta_2(3)\delta_1(4) \rangle_c$ whereas 
the second describes $\langle \delta_1(1)\delta_3(2)\delta_1(3)\delta_1(4) \rangle_c$. 
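The counting of labeled tree diagrams quoted above, $N^{N-2}$, can be checked by brute force for small $N$. The following is only an illustrative sketch; the function name `count_labeled_trees` is ours and does not appear in the review.

```python
from itertools import combinations

def count_labeled_trees(n):
    """Count the ways of drawing n-1 links that connect n labeled points."""
    nodes = range(n)
    all_links = list(combinations(nodes, 2))
    count = 0
    for links in combinations(all_links, n - 1):
        # Union-find: n-1 links form a tree iff no link closes a loop.
        parent = list(nodes)

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        is_tree = True
        for a, b in links:
            root_a, root_b = find(a), find(b)
            if root_a == root_b:
                is_tree = False
                break
            parent[root_a] = root_b
        if is_tree:
            count += 1
    return count

for n in (2, 3, 4, 5):
    print(n, count_labeled_trees(n), n ** (n - 2))
```

For $N = 3$ the three labelings correspond to the permutations of the single diagram in Fig. 5; for $N = 4$ the sixteen labelings split into twelve chain-like and four star-like trees, the two topologies of Fig. 6.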

In general, a consistent calculation of the connected $p$-point function induced 
by gravity to leading order (“tree-level”) requires from first to $(p-1)$th order 
in PT [232]. At large scales, where $\xi_2 \ll 1$, tree-level PT leads to hierarchical 
amplitudes $Q_N$ which are independent of $\xi_2$. As $\xi_2 \to 1$, there are corrections 
to tree-level PT which describe the $\xi_2$ dependence of the $Q_N$ amplitudes. These 
are naturally described in terms of diagrams as well; in particular, the next-to-leading 
order contributions (“one-loop” corrections) require from first to 
$(p+1)$th order in PT [557]. These are represented by one-loop diagrams, i.e. 
connected diagrams where there is one closed loop. The additional link over a 
tree diagram required to form a closed loop leads to $Q_N \propto \xi_2$. 
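The orders of PT that enter at a given loop level can be enumerated with a simple counting rule. The sketch below assumes Gaussian initial conditions, so that every link pairs two linear fields and a diagram with $L$ loops connecting $p$ points contains $p - 1 + L$ links; the PT orders attached to the $p$ points must then sum to $2(p - 1 + L)$. The function name `pt_orders` is hypothetical, and the sketch counts orders only: it neither checks connectedness nor distinguishes different diagrams sharing the same set of orders.

```python
def pt_orders(p, loops):
    """Sets of PT orders (n_1 >= ... >= n_p >= 1) with total 2 * (p - 1 + loops)."""
    target = 2 * (p - 1 + loops)

    def partitions(remaining, parts, max_part):
        # Non-increasing sequences of `parts` positive integers summing to `remaining`.
        if parts == 0:
            if remaining == 0:
                yield ()
            return
        for first in range(min(max_part, remaining - (parts - 1)), 0, -1):
            for rest in partitions(remaining - first, parts - 1, first):
                yield (first,) + rest

    return list(partitions(target, p, target))

print(pt_orders(3, 0))  # tree-level bispectrum
print(pt_orders(4, 0))  # tree-level trispectrum
print(pt_orders(2, 1))  # one-loop power spectrum
print(pt_orders(3, 1))  # one-loop bispectrum
```

The largest order returned is $p - 1$ at tree level and $p + 1$ at one loop, in line with the statement above.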

Figures 7 and 8 show the one-loop diagrams for the power spectrum and 
bispectrum. The one-loop corrections to the power spectrum (the two terms 
in square brackets in Fig. 7) describe the non-linear corrections to the linear 
evolution, that is, the effects of mode-coupling and the onset of non-linear 
structure growth. Recall that each line in a diagram represents the power 
spectrum $P^{(0)}(k)$ (or two-point function) of the linear density field. As a result, 
the one-loop power spectrum scales as $P^{(1)}(k) \propto P^{(0)}(k)^2$. 

Are all these diagrams really necessary? In essence, what the diagrammatic 
representation does is to group the contributions of the same order irrespective 
of the statistical quantity being considered. For example, it is not consistent to 
consider the evolution of the power spectrum in second-order PT (second term 
in Fig. 7) since there is a contribution of the same order coming from third-order 
PT (third term in Fig. 7). Instead, one should consider the evolution of 
the power spectrum to “one-loop” PT (which includes the two contributions 
of the same order, the terms in square brackets in Fig. 7). A similar situation 
happens with the connected four-point function induced by gravity; it is 
inconsistent to calculate it in second-order PT (first term in Fig. 6); rather, 
a consistent calculation of the four-point function to leading order requires 
“tree-level” PT (which also involves third-order PT, i.e. the second term in 
Fig. 6). 

We will now review results on the evolution of different statistical quantities 
in tree-level PT. 

Fig. 7. Diagrams for the two-point function or power spectrum up to one-loop. See 
Eqs. (165) and (166) for one-loop diagram amplitudes. 

Fig. 8. Diagrams for the three-point function or bispectrum up to one-loop. 

4.1.2 Power Spectrum Evolution in Linear PT 

The simplest (trivial) application of PT is the leading order contribution to 
the evolution of the power spectrum. Since we are dealing with the two-point 
function in Fourier space (N = 2), only linear theory is required, that is, the 
connected part is just given by a single line joining the two points. 

In this review we are concerned with the time evolution of the cosmic fields 
during the matter-dominated epoch. In this case, as we discussed in Chapter 
2, diffusion effects are negligible and the evolution can be cast in terms of 
perfect fluid equations that describe conservation of mass and momentum. 
The evolution of the density field is then given by a simple time-dependent 
scaling of the “linear” power spectrum 

$$P(k,\tau) = [D_1^{(+)}(\tau)]^2\, P_L(k),$$   (151) 

where $D_1^{(+)}(\tau)$ is the growing part of the linear growth factor. One must 
note, however, that the “linear” power spectrum specified by $P_L(k)$ (see footnote 17) 
derives from the linear evolution of density fluctuations through the radiation 
domination era and the resulting decoupling of matter from radiation. This 
evolution must be followed by using general relativistic Boltzmann numerical 
codes [499,76,416,578], although analytic techniques can be used to understand 
quantitatively the results [320,321]. The end result is that 

$$P_L(k) = k^{n_p}\, T^2(k),$$   (152) 

where $n_p$ is the primordial spectral index ($n_p = 1$ denotes the canonical scale-invariant 
spectrum [300,706,499]; see footnote 18) and $T(k)$ is the transfer function that 
describes the evolution of the density field perturbations through decoupling 
($T(0) = 1$). It depends on cosmological parameters in a complicated way, 
although in simple cases (where the baryonic content is negligible) it can 
be approximated by a fitting function that depends on the shape parameter 
$\Gamma \equiv \Omega_m h$ [76,21]. For the adiabatic cold dark matter (CDM) scenario, 
$T^2(k) \to \ln^2(k)/k^4$ as $k \to \infty$, due to the suppression of fluctuation growth 
during the radiation-dominated era; see e.g. [197] for a review. 

17 We denote the linear power spectrum interchangeably by $P_L(k)$ or by $P^{(0)}(k)$. 

18 This corresponds to fluctuations in the gravitational potential at the Hubble radius 
scale that have the same amplitude for all modes, i.e. the gravitational potential 
has a power spectrum $P_\varphi \sim k^{-3}$, as predicted by inflationary models; see Eq. (113). 


4.1.3 The Bispectrum Induced by Gravity 

We now focus on the non-linear evolution of the three-point cumulant of the 
density field, the bispectrum $B(\mathbf{k}_1, \mathbf{k}_2, \tau)$, defined by (compare with Eq. 149) 

$$\langle \delta(\mathbf{k}_1,\tau)\,\delta(\mathbf{k}_2,\tau)\,\delta(\mathbf{k}_3,\tau) \rangle_c = \delta_D(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)\, B(\mathbf{k}_1,\mathbf{k}_2,\tau).$$   (153) 


As we discussed already, it is convenient to define the reduced bispectrum $Q$ 
as follows [229,232] 

$$Q \equiv \frac{B(\mathbf{k}_1,\mathbf{k}_2,\tau)}{P(k_1,\tau)P(k_2,\tau) + P(k_2,\tau)P(k_3,\tau) + P(k_3,\tau)P(k_1,\tau)},$$   (154) 

which has the desirable property that it is scale and time independent to 
lowest order (tree-level) in non-linear PT, 

$$Q^{(0)} = \frac{2F_2(\mathbf{k}_1,\mathbf{k}_2)\,P(k_1,\tau)P(k_2,\tau) + {\rm cyc.}}{P(k_1,\tau)P(k_2,\tau) + P(k_2,\tau)P(k_3,\tau) + P(k_3,\tau)P(k_1,\tau)},$$   (155) 

where $F_2(\mathbf{k}_1,\mathbf{k}_2)$ denotes the second-order kernel obtained from the equations 
of motion, as in Section 2.4.2. Recall that this kernel is very insensitive to 
cosmological parameters [see Eq. (71)]; as a consequence, the tree-level 
reduced bispectrum $Q^{(0)}$ is almost independent of cosmology [236,313]. In 
addition, from Eq. (155) it follows that $Q^{(0)}$ is independent of time and 
normalization [232]. Furthermore, for scale-free initial conditions, $P_L(k) \propto k^n$, 
$Q^{(0)}$ is also independent of overall scale. For the particular case of equilateral 
configurations ($k_1 = k_2 = k_3$ and $\hat{\mathbf{k}}_i \cdot \hat{\mathbf{k}}_j = -0.5$ for all pairs), $Q^{(0)}$ is 
independent of spectral index as well, $Q^{(0)}_{\rm eq} = 4/7$. In general, for scale-free 
initial power spectra, $Q^{(0)}$ depends on configuration shape through, e.g., the 
ratio $k_1/k_2$ and the angle $\theta$ defined by $\hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2 = \cos\theta$. In fact, since bias 
between the galaxies and the underlying density field is known to change this 
shape dependence [235], measurements of the reduced bispectrum $Q$ in galaxy 
surveys could provide a measure of bias which is insensitive to other cosmological 
parameters [236], unlike the usual determination from peculiar velocities 
which has a degeneracy with the density parameter $\Omega_m$. We will review these 
applications in Chapter 8. 

Fig. 9. The tree-level reduced bispectrum $Q^{(0)}$ for triangle configurations given 
by $k_1/k_2 = 2$ as a function of the angle $\theta$ ($\hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2 = \cos\theta$). The different curves 
correspond to spectral indices $n = -2, -1.5, -1, -0.5, 0$ (from top to bottom). 

Figure 9 shows $Q^{(0)}$ for the triangle configuration given by $k_1/k_2 = 2$ as a 
function of the angle $\theta$ between these wavevectors ($\cos\theta = \hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2$) for different 
spectral indices. The shape or configuration dependence of $Q^{(0)}$ comes from the 
second-order perturbation theory kernel $F_2^{(s)}$ (see Eqs. (155) and (170)) and 
can be understood in physical terms as follows. From the recursion relations 
given in Chapter 2, we can write 


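$$F_2^{(s)}(\mathbf{k}_1,\mathbf{k}_2) = \frac{5}{14}\left[\alpha(\mathbf{k}_2,\mathbf{k}_1) + \alpha(\mathbf{k}_1,\mathbf{k}_2)\right] + \frac{2}{7}\,\beta(\mathbf{k}_1,\mathbf{k}_2),$$   (156) 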


with $\alpha$ and $\beta$ defined in Eq. (39). The terms in square brackets contribute 
a constant term, independent of configuration, coming from the $\theta \times \delta$ term 
in the equations of motion, plus terms which depend on configuration and 
describe gradients of the density field in the direction of the flow (i.e., the 
term $\mathbf{u} \cdot \nabla\delta$ in the continuity equation). Similarly, the last term in Eq. (156) 
contributes configuration-dependent terms which come from gradients of the 
velocity divergence in the direction of the flow (due to the term $(\mathbf{u} \cdot \nabla)\mathbf{u}$ in 
Euler’s equation). Therefore, the configuration dependence of the bispectrum 
reflects the anisotropy of structures and flows generated by gravitational 
instability. The enhancement of correlations for collinear wavevectors ($\theta = 0, \pi$) 
in Figure 9 reflects the fact that gravitational instability generates density 
and velocity divergence gradients which are mostly parallel to the flow [559]. 
The dependence on the spectrum is also easy to understand: models with more 
large-scale power (smaller spectral indices $n$) give rise to anisotropic structures 
and flows with larger coherence length, which upon ensemble averaging leads 
to a more anisotropic bispectrum. 

Fig. 10. The tree-level three-point amplitude in real space for triangle 
configurations given by $r_{12}/r_{23} = 2$ as a function of the angle $\theta$ ($\hat{\mathbf{r}}_{12} \cdot \hat{\mathbf{r}}_{23} = \cos\theta$). 
The different curves correspond to spectral indices $n = -2, -1.5, -1$ (from top to 
bottom at $\theta = 0.4\pi$). 
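The configuration dependence of the tree-level reduced bispectrum, Eq. (155), is easy to explore numerically for scale-free spectra. The sketch below assumes the standard explicit form of the symmetrized kernel, $F_2^{(s)} = 5/7 + (\mu/2)(k_1/k_2 + k_2/k_1) + (2/7)\mu^2$ with $\mu = \hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2$, which is what Eq. (156) reduces to for the usual definitions of $\alpha$ and $\beta$ (Eq. (39) is not reproduced in this section, so this is an assumption). The weak dependence of the kernel on cosmological parameters is neglected, and the function names are ours.

```python
import math

def f2s(k1, k2, mu):
    """Symmetrized second-order kernel; mu is the cosine of the angle between k1 and k2."""
    return 5.0 / 7.0 + 0.5 * mu * (k1 / k2 + k2 / k1) + (2.0 / 7.0) * mu * mu

def q_tree(k1, k2, theta, n):
    """Tree-level reduced bispectrum of Eq. (155) for P_L(k) proportional to k^n.

    theta is the angle between the wavevectors k1 and k2; k3 = -(k1 + k2).
    """
    mu12 = math.cos(theta)
    k3 = math.sqrt(k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * mu12)
    mu23 = -(k2 + k1 * mu12) / k3  # cosine of the angle between k2 and k3
    mu31 = -(k1 + k2 * mu12) / k3  # cosine of the angle between k3 and k1
    p1, p2, p3 = k1 ** n, k2 ** n, k3 ** n
    bispectrum = 2.0 * (f2s(k1, k2, mu12) * p1 * p2
                        + f2s(k2, k3, mu23) * p2 * p3
                        + f2s(k3, k1, mu31) * p3 * p1)
    return bispectrum / (p1 * p2 + p2 * p3 + p3 * p1)

# Equilateral triangles: the result should not depend on the spectral index.
for n in (-2.0, -1.0, 0.0):
    print(n, q_tree(1.0, 1.0, 2.0 * math.pi / 3.0, n))

# k1/k2 = 2 and n = -2: collinear versus a more open configuration.
print(q_tree(2.0, 1.0, 0.0, -2.0), q_tree(2.0, 1.0, 0.6 * math.pi, -2.0))
```

The equilateral values reproduce $Q^{(0)}_{\rm eq} = 4/7$ for every $n$, and the collinear configuration gives a larger amplitude than the open one, as described above. Degenerate triangles with $k_3 = 0$ (for example $k_1 = k_2$ at $\theta = \pi$) are not handled by this sketch.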

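4.1.4 The Three-Point Correlation Function 

The three-point function $\xi_3$ can be found straightforwardly by Fourier 
transformation of the bispectrum, leading to 

$$\xi_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3) = \left[\frac{10}{7}\,\xi(x_{13})\,\xi(x_{23}) + \nabla\xi(x_{13}) \cdot \nabla^{-1}\xi(x_{23}) + \nabla\xi(x_{23}) \cdot \nabla^{-1}\xi(x_{13}) + \frac{4}{7}\left(\nabla_a\nabla_b^{-1}\xi(x_{13})\right)\left(\nabla_a\nabla_b^{-1}\xi(x_{23})\right)\right] + {\rm cyc.},$$   (157) 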


where the inverse gradient is defined by the Fourier representation 

$$\nabla^{-1}\xi(x) \equiv -i \int d^3k\, \exp(i\mathbf{k} \cdot \mathbf{x})\, \frac{\mathbf{k}}{k^2}\, P(k).$$   (158) 

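For scale-free initial conditions, $P(k) \propto k^n$, $\xi(x) \propto x^{-(n+3)}$ (with $n < 0$ for 
convergence), and thus 

$$\xi_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3) = \left[\frac{10}{7} + \frac{n+3}{n}\,(\hat{\mathbf{x}}_{13} \cdot \hat{\mathbf{x}}_{23})\left(\frac{x_{23}}{x_{13}} + \frac{x_{13}}{x_{23}}\right) + \frac{4}{7}\,\frac{3 - 2(n+3) + (n+3)^2\,(\hat{\mathbf{x}}_{13} \cdot \hat{\mathbf{x}}_{23})^2}{n^2}\right] \xi(x_{13})\,\xi(x_{23}) + {\rm cyc.}$$   (159) 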


Similarly to Fourier space, we can define the three-point amplitude in real 
space, $Q_3$, 

$$Q_3 = \frac{\xi_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3)}{\xi(x_{12})\,\xi(x_{23}) + \xi(x_{23})\,\xi(x_{31}) + \xi(x_{31})\,\xi(x_{12})},$$   (160) 

which is shown in Fig. 10 for spectral indices $n = -2, -1.5, -1$ (solid, dashed 
and short-dashed, respectively). Note that in real space the three-point 
amplitude $Q_3$ has a stronger shape dependence for spectra with more power on 
small scales (larger spectral index $n$), unlike the case of Fourier space. This 
is because scales are weighted differently. Since $\xi(x)$ is actually equivalent to 
$k^3 P(k)$ rather than $P(k)$, using $\xi(x)/x^3$ to define $Q$ in real space rather than 
$\xi(x)$ leads to a behavior with spectral index similar to that in Fourier space. 
Note that for scale-free initial conditions, the three-point amplitude for 
equilateral triangles reduces to the following simple expression as a function of 
spectral index $n$, 

$$Q_3^{\rm eq} = \frac{18n^2 + 19n - 3}{7n^2}.$$   (161) 
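As a consistency check, the equilateral expression above can be recovered from the scale-free result of Eq. (159) using exact rational arithmetic. For an equilateral triangle the unit separation vectors meeting at a vertex make a 60 degree angle, so $\hat{\mathbf{x}}_{13} \cdot \hat{\mathbf{x}}_{23} = 1/2$, and the three cyclic terms are equal and cancel against the three equal terms in the denominator of $Q_3$. The function names below are ours.

```python
from fractions import Fraction

def q3_equilateral_from_bracket(n):
    """Square bracket of Eq. (159) evaluated for an equilateral triangle."""
    n = Fraction(n)
    gamma = n + 3                 # xi(x) is proportional to x^-(n+3)
    cos_angle = Fraction(1, 2)    # unit separation vectors at a vertex
    ratio_sum = 2                 # x23/x13 + x13/x23 for equal sides
    return (Fraction(10, 7)
            + gamma / n * cos_angle * ratio_sum
            + Fraction(4, 7) * (3 - 2 * gamma + gamma ** 2 * cos_angle ** 2) / n ** 2)

def q3_equilateral_closed_form(n):
    """Closed-form equilateral amplitude quoted in the text."""
    n = Fraction(n)
    return (18 * n ** 2 + 19 * n - 3) / (7 * n ** 2)

for n in (Fraction(-2), Fraction(-3, 2), Fraction(-1)):
    print(n, q3_equilateral_from_bracket(n), q3_equilateral_closed_form(n))
```

The two functions agree exactly for any nonzero rational $n$, which ties the closed form to the bracket of Eq. (159).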

Figure 11 shows a comparison of the tree-level PT prediction for $Q_3$ in $\Lambda$CDM 
models (lines) with the fully non-linear values of $Q_3$ measured in N-body 
simulations (symbols with error bars). Even on the earlier outputs ($\sigma_8 = 0.5$, 
left panel) corrections to the tree-level results become important at scales 
$r_{12} < 12$ Mpc/h. At larger scales there is an excellent agreement with tree-level 
PT. This seems in contradiction with claims in [346], but note that for the 
later outputs ($\sigma_8 = 1.0$, right panel) non-linear corrections can be significant 
at very large scales $r_{12} < 18$ Mpc/h, so that for precision measurements one 
needs to take into account the loop corrections (see [23] for more details). 

4.2 The Transition to the Non-Linear Regime: “Loop Corrections” 

4.2.1 One-Loop PT and Previrialization 

In the previous section we discussed the leading order contribution to 
correlation functions, and found that these are given by tree-level PT, resulting 
in the linear evolution of the power spectrum and in hierarchical amplitudes 
$Q_N$ independent of the amplitude of fluctuations. Higher-order corrections 
to tree-level PT (organized in terms of “loop” diagrams) can in principle be 
calculated, but what new physics do they describe? Essentially one-loop PT 
describes the first effects of mode-mode coupling in the evolution of the power 
spectrum, and the dependence of the hierarchical amplitudes $Q_N$ on $\xi_2$. It 
also gives a quantitative estimate of where tree-level PT breaks down, and 
leads to a physical understanding of the transition to the non-linear regime. 

19 In this case, however, one must be careful not to use such a statistic for scales 
near the zero-crossing of $\xi(r)$ [100]. 

Fig. 11. Tree-level PT predictions of the three-point amplitude $Q_3$ in the $\Lambda$CDM 
model for triangle configurations given by $r_{12}/r_{23} = 1$ as a function of the angle 
$\alpha$ ($\hat{\mathbf{r}}_{12} \cdot \hat{\mathbf{r}}_{23} = \cos\alpha$). The different curves correspond to different triangle sides 
$r_{12} = 6, 12, 18, 24$ Mpc/h (from top to bottom at $\alpha = 0.4\pi$). Symbols with error 
bars correspond to measurements in numerical simulations at $\sigma_8 = 0.5$ (left panel) 
and $\sigma_8 = 1.0$ (right panel). From [23]. 

One of the main lessons learned from one-loop PT is the fact that non-linear 
growth of density and velocity fields can be slower than in linear PT, in 
contrast with e.g. the spherical collapse model where non-linear growth is always 
faster than linear. This effect is due to tidal effects which lead to non-radial 
motions and thus less effective collapse of perturbations. This was conjectured 
as a possibility and termed “previrialization” [171]; numerical simulations, 
however, showed evidence in favor [677,510] and against [207] this idea. The first 
quantitative calculation of the evolution of power spectra beyond linear theory 
for a wide class of initial conditions and comparison with numerical 
simulations was done in [613], where it was shown that one-loop corrections to the 
linear power spectrum can be either negative or positive depending on whether 
the initial spectral index was larger or smaller than $n \approx -1$. Subsequent work 
confirmed these predictions in greater detail [428,408,558]; in particular, the 
connection between one-loop corrections to the power spectrum and previous 
work on previrialization was first emphasized in [408]. In fact, a detailed 
investigation shows that one-loop PT predicts the change of behavior to occur 
at $n \approx -1.4$ [558], and divergences appear for $n > -1$ which must be cut off 
at some small scale in order to produce finite results. We shall come back to 
this problem below. 

In addition, one-loop corrections to the bispectrum show a very similar 
behavior with initial spectral index [559,560]. For $n < -1.4$ one-loop corrections 
increase the configuration dependence of $Q$, whereas in the opposite case they 
tend to flatten it out. These results for scale-free initial conditions are 
relevant for understanding other spectra. Indeed, calculations for CDM 
spectra [27,334,560] showed that the non-linear power spectrum is smaller than 
the linear one close to the non-linear scale, where the effective spectral 
index is $n > -1$. Furthermore, these results give insight into the evolution of 
CDM-type initial spectra: transfer of power happens from large to small 
scales because more positive spectral indices evolve slower than negative ones. 
In fact, as a result, non-linear evolution drives the non-linear power spectrum 
closer to the critical index $n \approx -1$ [558,14]. 

4.2.2 The One-Loop Power Spectrum 

As mentioned above, one-loop corrections to the power spectrum (or equivalently 
to the two-point correlation function) have been extensively studied in the 
literature [353,678,354,135,613,428,334,27,408,558] (see footnote 20). We now briefly 
review these results. 

We can write the power spectrum up to one-loop corrections as 

$$P(k,\tau) = P^{(0)}(k,\tau) + P^{(1)}(k,\tau) + \ldots,$$   (162) 

where the superscript $(n)$ denotes an $n$-loop contribution, the tree-level (0-loop) 
contribution is just the linear spectrum, 

$$P^{(0)}(k,\tau) = [D_1^{(+)}]^2\, P_L(k),$$   (163) 

and the one-loop contribution consists of two terms (see Fig. 7), 

$$P^{(1)}(k,\tau) = P_{22}(k,\tau) + P_{13}(k,\tau),$$   (164) 

with 

$$P_{22}(k,\tau) \equiv 2 \int [F_2^{(s)}(\mathbf{k}-\mathbf{q},\mathbf{q})]^2\, P_L(|\mathbf{k}-\mathbf{q}|,\tau)\, P_L(q,\tau)\, d^3q,$$   (165) 

$$P_{13}(k,\tau) \equiv 6 \int F_3^{(s)}(\mathbf{k},\mathbf{q},-\mathbf{q})\, P_L(k,\tau)\, P_L(q,\tau)\, d^3q.$$   (166) 


20 Multi-loop corrections to the power spectrum were considered in [237], including 
the full contributions up to 2 loops and the most important terms at large $k$ in 3- 
and 4-loop order. 

Table 5 

Contributions to the one-loop power spectrum as a function of spectral index n. 

Here $P_{ij}$ denotes the amplitude given by a connected diagram representing the 
contribution from $\langle \delta_i \delta_j \rangle_c$ to the power spectrum. We have assumed Gaussian 
initial conditions, for which $P_{ij}$ vanishes if $i + j$ is odd. Note the different 
structure in the two contributions: Eq. (165) is positive definite and describes 
the effects of mode-coupling between waves with wave-vectors $\mathbf{k} - \mathbf{q}$ and $\mathbf{q}$, 
i.e. if $P_L(k) = 0$ for $k > k_c$, then $P_{22}(k) = 0$ only when $k > 2k_c$. On the other 
hand, Eq. (166) is in general negative (leading to the effects of previrialization 
mentioned above) and does not describe mode-coupling, i.e. $P_{13}(k)$ is 
proportional to $P_L(k)$. This term can be interpreted as the one-loop correction to 
the propagator in Eq. (87) [569], i.e. the nonlinear correction to the standard 
$a(\tau)$ linear growth. 
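The mode-coupling property of Eq. (165) can be illustrated with a crude numerical integration. The sketch below assumes the standard explicit form of the symmetrized kernel, $F_2^{(s)} = 5/7 + (\mu/2)(k_1/k_2 + k_2/k_1) + (2/7)\mu^2$, and a toy band-limited spectrum with $P_L = 1$ for $\epsilon < k < k_c$ and zero otherwise; the grid sizes, the cutoffs and the function names are arbitrary choices made for illustration, not values from the review.

```python
import math

def f2s(k1, k2, mu):
    """Symmetrized second-order kernel; mu is the cosine of the angle between k1 and k2."""
    return 5.0 / 7.0 + 0.5 * mu * (k1 / k2 + k2 / k1) + (2.0 / 7.0) * mu * mu

def p_band(k, eps=0.05, kc=1.0):
    """Toy linear spectrum: flat between the infrared and ultraviolet cutoffs."""
    return 1.0 if eps < k < kc else 0.0

def p22(k, p_lin, q_max=1.0, nq=200, nmu=200):
    """Midpoint-rule estimate of Eq. (165); the azimuthal integral gives 2*pi."""
    dq = q_max / nq
    dmu = 2.0 / nmu
    total = 0.0
    for i in range(nq):
        q = (i + 0.5) * dq
        pq = p_lin(q)
        if pq == 0.0:
            continue
        for j in range(nmu):
            mu = -1.0 + (j + 0.5) * dmu  # cosine of the angle between k and q
            kmq = math.sqrt(k * k + q * q - 2.0 * k * q * mu)
            pkmq = p_lin(kmq)
            if pkmq == 0.0:
                continue
            mu12 = (k * mu - q) / kmq  # cosine of the angle between k - q and q
            total += f2s(kmq, q, mu12) ** 2 * pkmq * pq * q * q * dq * dmu
    return 2.0 * 2.0 * math.pi * total

for k in (0.5, 1.5, 2.5):
    print(k, p_band(k), p22(k, p_band))
```

At $k = 1.5\,k_c$ the linear spectrum vanishes but $P_{22}$ does not, while at $k = 2.5\,k_c$ both vanish: power generated by mode-coupling reaches up to $2k_c$, as stated above.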

The structure of these contributions can be illustrated by their calculation for 
scale-free initial conditions, where the linearly extrapolated power spectrum 
is $P_L(k) = A a^2 k^n$, shown in Table 5. The linear power spectrum is cut off 
at low wavenumbers (infrared) and high wavenumbers (ultraviolet) to control 
divergences that appear in the calculation; that is, $P_L(k) = 0$ for $k < \epsilon$ and 
$k > k_c$. These results assume $k \gg \epsilon$ and $k \ll k_c$, otherwise there are additional 
terms [428,558]. 

The general structure of divergences is that for $n < -1$ there are infrared 
divergences that are caused by terms of the kind $\int P(q)/q^2\, d^3q$; these are 
cancelled when the partial contributions are added. In fact, it is possible to show 
that this cancellation still holds for leading infrared divergences to arbitrary 
number of loops [336]. It was shown in [557] that this cancellation is general: 
infrared divergences arise due to the rms velocity field (whose large-scale limit 
variance is $\int P(q)/q^2\, d^3q$), but since a homogeneous flow cannot affect 
equal-time correlation functions because of Galilean invariance of the equations of 
motion, these terms must cancel at the end. 

Ultraviolet divergences are more harmful. We see from Table 5 that for $n > -1$ 
the $P_{13}$ contribution becomes ultraviolet divergent (and when $n > 1$ for $P_{22}$ as 
well), but in this case there is no cancellation. Thus, one-loop corrections to the 
power spectrum are meaningless at face value for scale-free initial conditions 
with $n > -1$. Furthermore, one-loop corrections to the bispectrum are also 
divergent for scale-free initial spectra as $n \to -1$. Of course, it is possible that 
these divergences are cancelled by higher-order terms, but to date this has not 
been investigated. This seems a rather academic problem, since no linear power 
spectrum relevant in cosmology is scale-free, and for CDM-type spectra there 
are no divergences. On the other hand, understanding this problem may shed 
light on aspects of gravitational clustering in the transition to the non-linear 
regime. 

Fig. 12. One-loop corrections to the power spectrum of the density field, $\alpha_\delta(n)$, as a function 
of spectral index [see Eq. (169)]. Also shown are the one-loop corrections to the 
velocity divergence power spectrum, $\alpha_\theta(n)$. Note that non-linear effects can slow down 
the growth of the velocity power spectrum for a broader class of initial conditions 
than in the case of the density field. 

To characterize the degree of non-linear evolution when including one-loop 
corrections, it is convenient to define a physical scale from the linear power 
spectrum, the non-linear scale $R_0$, as the scale where the smoothed linear 
variance is unity, 

$$\sigma_L^2(R_0) = \int d^3k\, P_L(k,\tau)\, W^2(kR_0) = 1.$$   (167) 

For scale-free initial conditions and a Gaussian filter, $W(x) = \exp(-x^2/2)$, 
Eq. (167) gives $R_0^{n+3} = 2\pi A a^2\, \Gamma[(n+3)/2]$. This is related to the non-linear 
scale defined from the power spectrum, $\Delta(k_{nl}) = 4\pi k_{nl}^3 P_L(k_{nl}) = 1$, by 

$$k_{nl} R_0 = \Gamma[(n+5)/2].$$   (168) 
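The Gaussian-filter expression for $R_0$ quoted after Eq. (167) can be verified by direct numerical integration. In the sketch below, `amp` stands for the combination $A a^2$ and is set to one in arbitrary units; the integration range, grid size and function names are our own choices.

```python
import math

def r0_analytic(n, amp=1.0):
    """Non-linear scale from R_0^(n+3) = 2*pi*amp*Gamma[(n+3)/2]."""
    return (2.0 * math.pi * amp * math.gamma((n + 3.0) / 2.0)) ** (1.0 / (n + 3.0))

def linear_variance_gaussian(r, n, amp=1.0, nk=20000):
    """Midpoint-rule estimate of Eq. (167) for P_L = amp * k^n and W(x) = exp(-x^2/2)."""
    k_max = 10.0 / r  # W^2 = exp(-(k r)^2) is negligible beyond this wavenumber
    dk = k_max / nk
    total = 0.0
    for i in range(nk):
        k = (i + 0.5) * dk
        total += 4.0 * math.pi * k * k * amp * k ** n * math.exp(-(k * r) ** 2) * dk
    return total

for n in (-2.0, -1.5, -1.0):
    r0 = r0_analytic(n)
    print(n, r0, linear_variance_gaussian(r0, n))
```

For each spectral index the variance evaluated at the analytic $R_0$ comes out equal to one within the accuracy of the quadrature.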


Figure 12 displays the one-loop correction to the power spectrum in terms of 
the function $\alpha_\delta(n)$ defined by 

$$\Delta(k) \equiv \frac{2(kR_0)^{n+3}}{\Gamma[(n+3)/2]} \left[1 + \alpha_\delta(n)\,(kR_0)^{n+3}\right],$$   (169) 

which measures the strength of one-loop corrections (and similarly for the 
velocity divergence spectrum, replacing $\alpha_\delta$ by $\alpha_\theta$). This function has been 
calculated using the technique of dimensional regularization in [558] (see 
Appendix D for a brief discussion of this). From Fig. 12 we see that loop 
corrections are significant, with $\alpha_\delta$ close to unity or larger, for spectral indices 
$n < -1.7$. For $n_c \approx -1.4$ one-loop corrections to the power spectrum vanish 
(and for the bispectrum as well [559]). For this “critical” index, tree-level PT 
should be an excellent approximation. One should keep in mind, however, that 
the value of the critical index can change when higher-order corrections are 
taken into account, particularly given the proximity of $n_c$ to $n = -1$ where 
ultraviolet divergences drive $\alpha_\delta \to -\infty$. On the other hand, recent numerical 
results agree very well with $n_c \approx -1.4$, at least for redshifts $z \sim 3$ evolved 
from CDM-like initial spectra [702]. 

Fig. 13. The power spectrum for $n = -2$ scale-free initial conditions. Symbols denote 
measurements in numerical simulations from [560]. Lines denote linear PT, one-loop 
PT [Eq. (169)] and the Zel’dovich Approximation results [Eq. (181)], as labeled. 

Figure 12 also shows the one-loop correction coefficient $\alpha_\theta$ for the velocity 
divergence spectrum. We see that generally velocities grow much slower than 
the density field when non-linear contributions are taken into account. For 
$n > -1.9$ one-loop PT predicts that velocities grow slower than in linear PT. 
Although this has not been investigated in detail against numerical 
simulations, the general trend makes sense: tidal effects lead to increasingly 
non-radial motions as $n$ increases, thus the velocity divergence should grow 
increasingly slower than in the linear case. 

Figure 13 compares the results of one-loop corrections for $n = -2$ against 
numerical simulations, whereas the top left panel in Fig. 14 shows results for 
$n = -1.5$. In both cases we see very good agreement even into considerably 
non-linear scales where $\Delta(k) \sim 10 - 100$, providing a substantial improvement 
over linear PT. Also note the general trend, in agreement with numerical 
simulations, that non-linear corrections are significantly larger for $n = -2$ 
than for $n = -1.5$. 

Fig. 14. The left top panel shows the non-linear power spectrum as a function 
of scale for $n = -1.5$ scale-free initial conditions. Symbols denote measurements 
in numerical simulations, whereas lines show the linear, and the fitting formulas 
of [335,494] and one-loop perturbative results, as indicated. The other three 
panels show the reduced bispectrum $Q$ for triangle configurations with $k_1/k_2 = 2$, 
as a function of the angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$, in numerical simulations and for 
tree-level and one-loop PT. The panels correspond to stages of non-linear evolution 
characterized by $\Delta(k_1)$. Taken from [560]. 

4.2.3 The One-Loop Bispectrum 

The loop expansion for the bispectrum, $B = B^{(0)} + B^{(1)} + \ldots$, is given by the 
tree-level part $B^{(0)}$ in terms of a single diagram from second-order PT (see Fig. 5) 
plus its permutations over external momenta (recall that $\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 = 0$) 

$$B^{(0)} \equiv 2P_L(k_1)P_L(k_2)\,F_2^{(s)}(\mathbf{k}_1,\mathbf{k}_2) + 2P_L(k_2)P_L(k_3)\,F_2^{(s)}(\mathbf{k}_2,\mathbf{k}_3) + 2P_L(k_3)P_L(k_1)\,F_2^{(s)}(\mathbf{k}_3,\mathbf{k}_1).$$   (170) 

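The one-loop contribution consists of four distinct diagrams involving up to 
fourth-order solutions [559,560] 

$$B^{(1)} \equiv B_{222} + B_{321}^{I} + B_{321}^{II} + B_{411},$$   (171) 

where: 

$$B_{222} \equiv 8 \int d^3q\, P_L(q,\tau)\, F_2^{(s)}(-\mathbf{q},\mathbf{q}+\mathbf{k}_1)\, P_L(|\mathbf{q}+\mathbf{k}_1|,\tau)\, F_2^{(s)}(-\mathbf{q}-\mathbf{k}_1,\mathbf{q}-\mathbf{k}_2)\, P_L(|\mathbf{q}-\mathbf{k}_2|,\tau)\, F_2^{(s)}(\mathbf{k}_2-\mathbf{q},\mathbf{q}),$$   (172) 

$$B_{321}^{I} \equiv 6\, P_L(k_3,\tau) \int d^3q\, P_L(q,\tau)\, F_3^{(s)}(-\mathbf{q},\mathbf{q}-\mathbf{k}_2,-\mathbf{k}_3)\, P_L(|\mathbf{q}-\mathbf{k}_2|,\tau)\, F_2^{(s)}(\mathbf{q},\mathbf{k}_2-\mathbf{q}) + {\rm permutations},$$   (173) 

$$B_{321}^{II} \equiv 6\, P_L(k_2,\tau)\, P_L(k_3,\tau)\, F_2^{(s)}(\mathbf{k}_2,\mathbf{k}_3) \int d^3q\, P_L(q,\tau)\, F_3^{(s)}(\mathbf{k}_3,\mathbf{q},-\mathbf{q}) + {\rm permutations},$$   (174) 

$$B_{411} \equiv 12\, P_L(k_2,\tau)\, P_L(k_3,\tau) \int d^3q\, P_L(q,\tau)\, F_4^{(s)}(\mathbf{q},-\mathbf{q},-\mathbf{k}_2,-\mathbf{k}_3) + {\rm permutations}.$$   (175) 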


For the reduced bispectrum $Q$ [see Eq. (154)], the loop expansion yields: 

$$Q \equiv \frac{B^{(0)} + B^{(1)} + \ldots}{\Sigma^{(0)} + \Sigma^{(1)} + \ldots},$$   (176) 

where $\Sigma^{(0)} \equiv P_L(k_1)P_L(k_2) + P_L(k_2)P_L(k_3) + P_L(k_3)P_L(k_1)$, and its one-loop 
correction $\Sigma^{(1)} \equiv P^{(0)}(k_1)P^{(1)}(k_2)$ + permutations (recall $P^{(0)} \equiv P_L$). For large 
scales, it is possible to expand $Q \equiv Q^{(0)} + Q^{(1)} + \ldots$, which gives: 

$$Q^{(0)} \equiv \frac{B^{(0)}}{\Sigma^{(0)}}, \qquad Q^{(1)} \equiv \frac{B^{(1)} - Q^{(0)}\,\Sigma^{(1)}}{\Sigma^{(0)}}.$$   (177) 

Note that $Q^{(1)}$ depends on the normalization of the linear power spectrum, 
and its amplitude increases with time evolution. For initial power-law spectra 
$P_L(k) = A a^2 k^n$ with $n = -2$ the calculation using dimensional regularization 
(see Appendix D) yields a closed form; otherwise the result can be expressed 
in terms of hypergeometric functions of two variables [559] or computed by 
direct numerical integration [560]. 
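The relation between the ratio form, Eq. (176), and its large-scale expansion, Eq. (177), can be made concrete with a small numerical experiment. The numbers below are arbitrary placeholders chosen for illustration, not values taken from any calculation in the review; `eps` is a hypothetical bookkeeping parameter that scales the one-loop terms.

```python
def q_ratio(b0, b1, s0, s1):
    """Eq. (176) truncated at one loop."""
    return (b0 + b1) / (s0 + s1)

def q_expanded(b0, b1, s0, s1):
    """Q^(0) + Q^(1) as given by Eq. (177)."""
    q0 = b0 / s0
    q1 = (b1 - q0 * s1) / s0
    return q0 + q1

b0, s0 = 1.3, 2.1        # placeholder tree-level quantities
b1_unit, s1_unit = 0.7, 0.4  # placeholder one-loop quantities per unit eps

for eps in (1e-3, 1e-2, 1.0, 100.0):
    b1, s1 = eps * b1_unit, eps * s1_unit
    print(eps, q_ratio(b0, b1, s0, s1), q_expanded(b0, b1, s0, s1))
```

When the loop terms are small the two expressions differ only at second order in `eps`. When they dominate, the ratio form tends to the constant $B^{(1)}/\Sigma^{(1)}$ while the expanded form keeps growing, which is the sense in which Eq. (176) saturates.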

Figure 14 shows the predictions of one-loop PT compared to N-body 
simulations for scale-free initial conditions with $n = -1.5$. In the top right panel, we 
see that the predictions of Eq. (177) agree very well with simulations at the 
nonlinear scale. In the bottom panels, where $\Delta > 1$, we have used Eq. (176) 
instead of Eq. (177). At these scales Eq. (176) saturates, that is, the one-loop 
quantities $B^{(1)}$ and $\Sigma^{(1)}$ dominate over the corresponding tree-level values and 
further time evolution does not change much the amplitude $Q$, because $B^{(1)}$ 
and $\Sigma^{(1)}$ have the same scale and, by self-similarity, time dependence. At even 
more non-linear scales, simulations show that the configuration dependence of 
the bispectrum is completely washed out [560]. 

Fig. 15. One-loop bispectrum predictions for equilateral configurations for scale-free 
spectra with $n = -2$, Eq. (178), and $n = -1.5$, Eq. (179), against N-body simulation 
measurements from [560]. Error bars come from different output times, assuming 
self-similarity, see Sect. 4.5.1. This might not be well obeyed for $n = -2$, due 
to the importance of finite-volume effects for such a steep spectrum, particularly at 
late times, see [418] and discussion in Sect. 6.12.1. 

Using the one-loop power spectrum for $n = -2$ given in Table 5, $P^{(1)}(k) =
A^2 a^4\, 55\pi^3/(98k)$, $Q^{(1)}$ follows from Eq. (177). The calculation can be done
analytically [559]; for conciseness we reproduce here only the result for
equilateral configurations,

$$ Q_{\rm eq} = \frac{4}{7} + \frac{1426697}{3863552}\,\pi^{3/2}\, k R_0 = 0.57\,[1 + 3.6\, k R_0], \qquad (n = -2) \tag{178} $$

and for $n = -1.5$ we have from numerical integration [560]

$$ Q_{\rm eq} = \frac{4}{7} + 1.32\,(k R_0)^{3/2} = 0.57\,[1 + 2.316\,(k R_0)^{3/2}]. \qquad (n = -1.5) \tag{179} $$

Figure 15 compares these results against N-body simulations. We see that
despite the strong corrections, with one-loop coefficients larger than unity,
one-loop predictions are accurate even at $kR_0 = 1$. As we pointed out before,
many of the scale-free results carry over to the CDM case taking into account
the effective spectral index. Figure 16 illustrates the fact that one-loop
corrections can increase quite significantly the configuration dependence of the
bispectrum at weakly non-linear scales (left panel) when the spectral index is
$n < -2$, in agreement with numerical simulations. At the other extreme, in
the highly non-linear regime (right panel), the bispectrum becomes effectively
independent of triangle shape, with an amplitude that approximately matches
the collinear amplitude in tree-level PT.
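
The size of the one-loop coefficients quoted above can be checked with a few lines of code. This is only a sketch: it hard-codes the two equilateral results quoted in Eqs. (178) and (179), and the function names are illustrative rather than part of any published code.

```python
import math

Q_TREE = 4.0 / 7.0  # tree-level equilateral amplitude


def q_eq_one_loop(kr0, n):
    """One-loop equilateral Q for the two scale-free cases quoted in the text."""
    if n == -2:
        coeff = 1426697.0 / 3863552.0 * math.pi ** 1.5   # Eq. (178)
        return Q_TREE + coeff * kr0
    if n == -1.5:
        return Q_TREE + 1.32 * kr0 ** 1.5                 # Eq. (179)
    raise ValueError("only n = -2 and n = -1.5 are covered by this sketch")


def relative_loop_coefficient(n):
    """Coefficient c in Q_eq = (4/7) [1 + c (k R_0)^p], evaluated at k R_0 = 1."""
    return (q_eq_one_loop(1.0, n) - Q_TREE) / Q_TREE


if __name__ == "__main__":
    for n in (-2, -1.5):
        print(n, relative_loop_coefficient(n))
```

Both coefficients come out larger than unity, which is the sense in which the corrections are strong at $kR_0 = 1$.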

Fig. 16. The left panel shows the one-loop bispectrum predictions for the CDM model
at scales approaching the non-linear regime, for $k_1/k_2 = 2$ and $\Delta \sim 1$, against
numerical simulations [560]. The right panel shows the saturation of $Q$ at small
scales in the highly non-linear regime, for two different ratios $k_1/k_2 = 2, 3$ and
$\Delta > 100$ [563]. Dashed lines in both panels correspond to tree-level PT results.

Based on results from N-body simulations, it has been pointed out in [234]
(see also [240]) that for $n = -1$ nonlinear evolution tends to “wash out” the
configuration dependence of the bispectrum present at the largest scales (and
given by tree-level perturbation theory), giving rise to the so-called hierarchical
form $Q \approx \text{const}$ in the strongly non-linear regime (see Sect. 4.5.5). One-loop
perturbation theory must predict this feature in order to be a good description
of the transition to the nonlinear regime. In fact, numerical integration [559]
of the one-loop bispectrum for different spectral indices from $n = -2$ to
$n = -1$ shows that there is a change in behavior of the nonlinear evolution:
for $n < -1.4$ the one-loop corrections enhance the configuration dependence
of the bispectrum, whereas for $n > -1.4$ they tend to cancel it, in qualitative
agreement with numerical simulations. Note that this “critical index” $n_c \approx
-1.4$ is the same spectral index at which one-loop corrections to the power
spectrum vanish, marking the transition between faster and slower than linear
growth of the variance of density fluctuations.

4.3 The Power Spectrum in the Zel’dovich Approximation

The Zel’dovich approximation (ZA, [705]) is one of the rare cases in which
exact (non-perturbative) results can be obtained. However, given the drastic
approximation to the dynamics, these exact results for the evolution of
clustering statistics are of limited interest due to their restricted regime of validity.
The reason is that in the ZA, when different streams cross, they
pass each other without interacting, because the evolution of fluid elements is
local. As a result, high-density regions become washed out. Nonetheless, the
ZA often provides useful insights into non-linear behavior.

For Gaussian initial conditions, the full non-linear power spectrum in the
ZA can be obtained as follows [77,430,556,220,642]. Changing from Eulerian
to Lagrangian coordinates, the Fourier transform of the density field is
$\delta(\mathbf{k}) = \int d^3q\, \exp[i\mathbf{k}\cdot(\mathbf{q}+\mathbf{\Psi})]$, where $\mathbf{\Psi}(\mathbf{q})$ is the displacement field. The power
spectrum is thus

$$ P(k) = \int d^3q\, \exp(i\mathbf{k}\cdot\mathbf{q})\, \langle \exp(i\mathbf{k}\cdot\Delta\mathbf{\Psi}) \rangle, \tag{180} $$


where $\Delta\mathbf{\Psi} \equiv \mathbf{\Psi}(\mathbf{q}_1) - \mathbf{\Psi}(\mathbf{q}_2)$ and $\mathbf{q} = \mathbf{q}_1 - \mathbf{q}_2$. For Gaussian initial conditions
the ZA displacement is a Gaussian random field, so Eq. (180) can be evaluated
in terms of the two-point correlator of $\mathbf{\Psi}(\mathbf{q})$. An analytic result for the power
spectrum in the ZA has been obtained in [642] for scale-free initial conditions
with $-3 < n < -1$. For $n = -2$ it is


$\Delta(k)$
(181)


where the non-linear wavenumber obeys $\Delta_L(k_{nl}) = 1$. This result is shown
in Fig. 13 (note that in the figure we use $R_0$ to characterize the non-linear
scale, $k_{nl}R_0 = \Gamma[(n+5)/2]$), together with the prediction of one-loop PT,
linear theory and measurements in N-body simulations (symbols with error
bars). Clearly the lack of power at small scales due to shell-crossing makes the
ZA prediction a poor description of the non-linear power spectrum. Attempts
have been made in the literature to truncate the small-scale power in the
initial conditions and then use the ZA [138]; this improves the cross-correlation
coefficient between ZA and N-body simulation density fields [138,106,455] but
it does not bring the power spectrum into agreement [106,455]. Similar results
for the effect of shell crossing on the power spectrum hold for 2LPT and 3LPT,
see e.g. [106,455,367].


4.4 Non-Gaussian Initial Conditions


4.4.1 General Results

So far we have discussed results for Gaussian initial conditions. When the
initial conditions are not Gaussian, higher-order correlation functions are
non-zero from the beginning and their evolution beyond linear PT is non-trivial [238].
Here we present a brief summary of the general results for the power spectrum
and bispectrum; in the next section we discuss the application to the $\chi^2$ model,
for which correlation functions beyond linear perturbation theory have been
derived [565]. This belongs to the class of dimensional scaling models, in which
the hierarchy of initial correlation functions obeys $\xi_N \sim \xi_2^{N/2}$. Another
dimensional scaling model that has been studied is the non-linear $\sigma$-model [333].
In addition, hierarchical scaling models, where $\xi_N \sim \xi_2^{N-1}$ as generated by
gravity from Gaussian initial conditions, have been studied in [414,670]. Most
quantitative studies of non-Gaussian initial conditions, however, have been
done using one-point statistics rather than correlation functions; we review
them in Sect. 5.6.

It is worth emphasizing that the arguments developed in this section (and
in Sect. 5.6) are valid only if the history of density fluctuations can be well
separated into two periods: (i) imprint of non-Gaussian initial fluctuations at
very early times, where $\sigma_I \ll 1$, and then (ii) growth of these fluctuations due
to gravitational instability. This is a good approximation for most physically
motivated non-Gaussian models.

Let us consider the evolution of the power spectrum and bispectrum from
arbitrary non-Gaussian initial conditions (footnote 21). The first non-trivial correction to
the linear evolution of the power spectrum involves second-order PT, since
$\langle\delta^2\rangle = \langle(\delta_1 + \delta_2 + \cdots)^2\rangle \approx \langle\delta_1^2\rangle + 2\langle\delta_1\delta_2\rangle + \cdots$; the second term, which vanishes
in the Gaussian case (since $\langle\delta_1\delta_2\rangle \sim \langle\delta_1^3\rangle$), leads instead to (footnote 22)

$$ P(k) = P^I(k) + 2\int d^3q\, F_2(\mathbf{k}+\mathbf{q}, -\mathbf{q})\, B^I(\mathbf{k}, \mathbf{q}), \tag{182} $$

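which depends on the initial bispectrum $B^I$, and similarly for the non-linear
evolution of the bispectrum

$$ B_{123} = B^I_{123} + B^G_{123} + \int d^3q\, F_2(\mathbf{k}_1+\mathbf{k}_2-\mathbf{q}, \mathbf{q})\, P_4^I(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_1+\mathbf{k}_2-\mathbf{q}, \mathbf{q}), \tag{183} $$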

where $B^I_{123}$ denotes the contribution of the initial bispectrum, scaled to the
present time using linear PT, $B^I_{123}(\tau) \propto [D_1^{(+)}(\tau)]^3$, $B^G_{123}$ represents the usual
gravitationally induced bispectrum, Eq. (155), and the last term represents
the contribution coming from the initial trispectrum linearly evolved to the
present, $P_4^I$, given by

$$ \langle \delta^I(\mathbf{k}_1)\,\delta^I(\mathbf{k}_2)\,\delta^I(\mathbf{k}_3)\,\delta^I(\mathbf{k}_4) \rangle_c = \delta_D(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 + \mathbf{k}_4)\, P_4^I(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3, \mathbf{k}_4). \tag{184} $$

Clearly, the complicated term in Eq. (183) is the last one, which involves a
convolution of the initial trispectrum with the second-order PT kernel $F_2(\mathbf{k}_1, \mathbf{k}_2)$.
Note that only the first term scales as $[D_1^{(+)}(\tau)]^3$; the last two terms have the
same scaling with time, $[D_1^{(+)}(\tau)]^4$, and therefore dominate at late times. The
structure of these contributions is best illustrated by considering a specific
model, as we now do.

21 See [672] for a recent study of the trispectrum for non-Gaussian initial conditions.

22 See Sect. 5.6 for additional explanation of the new contributions that appear due
to primordial non-Gaussianity.

4.4.2 $\chi^2$ Initial Conditions

An example that shows how different the bispectrum can be in models with
non-Gaussian initial conditions is the chi-squared model [513,514]. There are
in fact a number of inflationary models in the literature that motivate $\chi^2$ initial
conditions [380,7,404,512]. It is also possible that this particular model may be
a good representation of the general behavior of dimensional scaling models,
and thus provide valuable insight. In this case, the density field after inflation
is proportional to the square of a Gaussian scalar field $\phi(\mathbf{x})$, $\rho(\mathbf{x}) \propto \phi(\mathbf{x})^2$.
The initial correlations are most easily calculated in real space [514]:


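$$ \xi_2 = 2\,\frac{\xi_\phi^2(r)}{\sigma_\phi^4}, \tag{185} $$

$$ \xi_3 = 2^{3/2} \sqrt{\xi_2(r_{12})\,\xi_2(r_{23})\,\xi_2(r_{31})}, \tag{186} $$

$$ \xi_4 = 4\left[\sqrt{\xi_2(r_{12})\xi_2(r_{23})\xi_2(r_{34})\xi_2(r_{41})} + \sqrt{\xi_2(r_{12})\xi_2(r_{24})\xi_2(r_{43})\xi_2(r_{31})} + \sqrt{\xi_2(r_{13})\xi_2(r_{32})\xi_2(r_{24})\xi_2(r_{41})}\right], \tag{187} $$

where $\xi_\phi$ is the two-point correlation function of $\phi$, $\sigma_\phi^2$ its variance, and
$r_{ij} \equiv |\mathbf{r}_i - \mathbf{r}_j|$. However, non-linear corrections are more difficult to
calculate in real space [238], so we turn to Fourier space. The initial density power
spectrum and bispectrum read (a similar expression holds for the trispectrum,
see [565]),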

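$$ P^I(k) = 2\int d^3q\, P_\phi(q)\, P_\phi(|\mathbf{k} - \mathbf{q}|), \tag{188} $$

$$ B^I(k_1, k_2, k_3) = 12\int d^3q\, P_\phi(q)\, P_\phi(|\mathbf{k}_1 - \mathbf{q}|)\, P_\phi(|\mathbf{k}_2 + \mathbf{q}|), \tag{189} $$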


where $P_\phi(k)$ denotes the power spectrum of the $\phi$ field. For scale-free spectra,
$P_\phi(k) \propto k^{n_\phi}$, $P^I(k) \propto k^{2n_\phi+3}$, with amplitude calculable in terms of Gamma
functions; similarly the bispectrum can be expressed in terms of hypergeometric
functions [565]. To calculate the hierarchical amplitude to tree-level
we also need the next-to-leading order evolution of the power spectrum, that
is Eq. (182), which depends on the initial bispectrum, Eq. (189). A simple
analytic result is obtained for the particular case $P_\phi(k) = A k^{-2}$, not too far
from the “canonical” $n_\phi = -2.4$ (e.g. giving $n = -1.8$, [513,514]), then [565]

Defining the non-linear scale $k_{nl}$ from the linear power spectrum as usual,
$4\pi k_{nl}^3 P_L(k_{nl}) = \Delta_L(k_{nl}) = 1$, it follows that

$$ \Delta(k) = \left(\frac{k}{k_{nl}}\right)^2 \left(1 + \frac{24}{7\sqrt{2}\,\pi}\,\frac{k}{k_{nl}}\right). \tag{191} $$


Fig. 17. The reduced bispectrum $Q$ for triangles with sides $k_1 = 0.068$ h/Mpc and
$k_2 = 2k_1$ as a function of the angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$ (left panel). The right panel
shows $Q$ for equilateral triangles as a function of scale $k$. Triangles denote linear
extrapolation from $\chi^2$ initial conditions, whereas square symbols show the result of
non-linear evolution. Dot-dashed lines show the predictions of non-linear PT from
Gaussian initial conditions with the same initial power spectrum as the $\chi^2$ model.

Then the tree-level hierarchical amplitude reads [565],

$$ Q_{123} = \frac{4\sqrt{2}}{\pi}\,\frac{k_{nl}}{k_1 + k_2 + k_3} - \frac{192}{7\pi^2}\,\frac{k_1 k_2 + k_2 k_3 + k_3 k_1}{(k_1 + k_2 + k_3)^2} + Q^G_{123} + Q_{123}(P_4), \tag{192} $$

where $Q^G_{123}$ denotes the hierarchical amplitude obtained from Gaussian initial
conditions, and $Q_{123}(P_4)$ denotes the contribution from the last term in
Eq. (183), which is difficult to calculate analytically. In particular, for
equilateral configurations $Q^I_{\rm eq} = (4\sqrt{2}/3\pi)(k_{nl}/k)$. On the other hand, for Gaussian
initial conditions, $Q^G_{\rm eq} = 4/7$ independent of spectral index; similarly there is
a contribution from non-Gaussian initial conditions that is scale independent,
$\delta Q_{\rm eq} = -64/(7\pi^2)$. Since $Q_{123}(P_4)$ is also independent of scale, it turns out that
the signature of this type of non-Gaussian initial conditions is that $Q_{123}$ shows
a strong scale dependence at large scales as $k/k_{nl} \to 0$. This is not just a
peculiar property of this particular model, but rather of any non-Gaussian initial
conditions with dimensional scaling (footnote 23). Note also that $Q^I$ shows, in some sense,
the opposite configuration dependence from $Q^G$: for triangles where $k_1/k_2 = 2$
as in Fig. 9, $Q^I(\theta)$ is an increasing function of $\theta$, as expected from the scale
dependence; in particular $Q^I(\pi)/Q^I(0) = 3/2$.
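
The configuration dependence of the non-Gaussian terms can be made concrete with a small numerical sketch. It assumes the first two terms of Eq. (192) in the form $Q^I = (4\sqrt{2}/\pi)\,k_{nl}/(k_1+k_2+k_3)$ and $\delta Q = -(192/7\pi^2)(k_1k_2+k_2k_3+k_3k_1)/(k_1+k_2+k_3)^2$, which reproduce the equilateral values and the ratio $Q^I(\pi)/Q^I(0) = 3/2$ quoted above; the function names are illustrative only.

```python
import math


def triangle_third_side(k1, k2, theta):
    """|k3| for k3 = -(k1 + k2), with theta the angle between k1 and k2."""
    return math.sqrt(k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * math.cos(theta))


def q_chi2_initial(k1, k2, theta, k_nl):
    """Scale-dependent first term of Eq. (192) (assumed form, see lead-in)."""
    k3 = triangle_third_side(k1, k2, theta)
    return 4.0 * math.sqrt(2.0) / math.pi * k_nl / (k1 + k2 + k3)


def delta_q_chi2(k1, k2, theta):
    """Scale-independent second term of Eq. (192) (assumed form, see lead-in)."""
    k3 = triangle_third_side(k1, k2, theta)
    pair_sum = k1 * k2 + k2 * k3 + k3 * k1
    return -192.0 / (7.0 * math.pi ** 2) * pair_sum / (k1 + k2 + k3) ** 2


if __name__ == "__main__":
    k_nl = 0.33            # h/Mpc, the value used for Fig. 17
    k1 = 0.068             # h/Mpc
    k2 = 2.0 * k1
    for theta in (0.0, math.pi / 2.0, math.pi):
        print(theta, q_chi2_initial(k1, k2, theta, k_nl))
```

With this convention $\theta = 0$ corresponds to $k_3 = k_1 + k_2$, so the perimeter shrinks as $\theta \to \pi$ and $Q^I$ grows; an equilateral triangle corresponds to $\theta = 2\pi/3$.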

23 See Sect. 5.6 for a more detailed discussion of this point and its generalizations.

Figure 17 shows the results of using 2LPT (see Sect. 2.7) evolved from $\chi^2$ initial
conditions [565]. The auxiliary Gaussian field $\phi$ was chosen to have a spectral
index $n_\phi = -2.4$, leading to $n = -1.8$ as proposed in [513]. The amplitude of
the power spectrum has been chosen to give $k_{nl} = 0.33$ h/Mpc. The dashed
lines in Fig. 17 (left panel) show the predictions of the first term in Eq. (192) for
the reduced bispectrum at $k_1 = 0.068$ h/Mpc, $k_2 = 2k_1$, as a function of the angle
$\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$. This corresponds to $n = -1$; however, it approximately
matches the numerical results (triangles, $n = -1.8$). The latter show less
dependence on angle, as expected because the scale dependence in the $n =
-1.8$ case ($Q^I \propto k^{-0.6}$) is weaker than for $n = -1$ ($Q^I \propto k^{-1}$). The right panel
in Fig. 17 shows equilateral configurations as a function of scale for $\chi^2$ initial
conditions (triangles) and $Q^I_{\rm eq}(k) = 0.8\,(k/k_{nl})^{-0.6}$ (dashed lines), where the
proportionality constant was chosen to fit the numerical result; this is slightly
larger than the prediction of the first term of Eq. (192) for $n = -1$ equilateral
configurations, and closer to the real-space result $Q_{\rm eq}(x) = 0.94\,(x/x_{nl})^{0.6}$.
The behavior of the $\chi^2$ bispectrum is markedly different from that generated
by gravity from Gaussian initial conditions for an identical power spectrum
(dot-dashed lines in Fig. 17) [225]. The structures generated by squaring a Gaussian
field roughly correspond to the underlying Gaussian high peaks, which are
mostly spherical; thus the reduced bispectrum is approximately flat. In fact,
the increase of $Q^I$ as $\theta \to \pi$ seen in Fig. 17 is basically due to the scale
dependence of $Q^I$, i.e. as $\theta \to \pi$, the side $k_3$ decreases and thus $Q^I$ increases.
As shown in Eq. (192), non-linear corrections to the bispectrum are significant
at the scales of interest, so linear extrapolation of the initial bispectrum is
insufficient to make comparison with current observations. The square symbols
in the left panel of Fig. 17 show the reduced bispectrum after non-linear corrections
are included. As a result, the familiar dependence of $Q_{123}$ on the triangle
shape due to the dynamics of large-scale structures is recovered, and the scale
dependence shown by $Q^I$ is now reduced (right panel in Fig. 17). However,
the differences between the Gaussian and $\chi^2$ cases are very obvious: the $\chi^2$
evolved bispectrum has an amplitude about 2-4 times larger than that of an
initially Gaussian field with the same power spectrum. Furthermore, the $\chi^2$
case shows residual scale dependence that reflects the dimensional scaling of
the initial conditions. These signatures can be used to test this model against
observations [225,567,211], as we shall discuss in Sect. 8.

4.5 The Strongly Non-Linear Regime

In this section we consider the behavior of the density and velocity fields in
the strongly non-linear regime, with emphasis on the connections with PT.
Only a limited number of relevant results are known in this regime, due to the
complexity of solving the Vlasov equation for the phase-space density distribution.
These results, based on simple arguments of symmetry and stability, nevertheless
lead to valuable insight into the behavior of correlations at small scales.

4.5.1 The Self-Similar Solution

The existence of self-similar solutions relies on two assumptions within the
framework of collisionless dark matter clustering:

(1) There are no characteristic time-scales; this requires $\Omega_m = 1$, where the
expansion factor scales as a power law, $a \sim t^{2/3}$.

(2) There are no characteristic length-scales. This implies scale-free initial
conditions, e.g. Gaussian with initial spectrum $P_I(k) \sim k^n$.

Since gravity is scale-free, there are no scales involved in the solution of the
coupled Vlasov and Poisson equations. As a result, the Vlasov equation
admits self-similar solutions of the form [171]

$$ f(\mathbf{x}, \mathbf{p}, t) = t^{-3-3\alpha}\, \hat f(\mathbf{x}/t^{\alpha}, \mathbf{p}/t^{\beta+1/3}), \tag{193} $$

where $\beta = \alpha + 1/3$ and $t$ is cosmic time. Integration over momentum leads
to correlation functions that are only functions of the self-similarity variables
$\mathbf{s}_i \equiv \mathbf{x}_i/t^{\alpha}$; in particular the two-point correlation function reads

$$ \xi_2(x, t) = f_2(s), \tag{194} $$

and similarly for higher-order correlation functions, e.g. $\xi_3(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, t) =
f_3(\mathbf{s}_1, \mathbf{s}_2, \mathbf{s}_3)$. Note that this solution holds in all regimes, from large to small
scales. Using the large-scale behavior expected from linear PT, it is then
possible to compute the index $\alpha$: requiring that $\xi_L(x, a) \sim a^2 x^{-(n+3)}$ be a function
only of the self-similarity variable $x t^{-\alpha}$ leads to

$$ \alpha = \frac{4}{3(n+3)}. \tag{195} $$

Note that the self-similar scaling of correlation functions can also be obtained
from the fluid equations of motion [558], as expected since only symmetry
arguments (which have nothing to do with shell crossing) are involved (footnote 24).
Self-similarity reduces the dimensionality of the equations of motion; it is possible
to achieve further reduction by considering symmetric initial conditions, e.g.
planar, cylindrical or spherical. In these cases, exact self-similar solutions can
be found by direct numerical integration, see e.g. [214,60]. Although this
provides useful insight about the non-linear behavior of isolated perturbations,
it does not address the evolution of correlation functions. Detailed results for
correlation functions in the non-linear regime can however be obtained by
combining the self-similar solution with stable clustering arguments, as we
now discuss.

24 For $n = -2$, where finite volume effects become very important, self-similarity
has been difficult to obtain in numerical simulations. However, even in this case
current results show that self-similarity is obeyed [338].

4.5.2 Stable Clustering

Fig. 18. The ratio of the mean pair (peculiar) velocity to the Hubble velocity,
$-u/Hx$, as a function of the mean correlation function $\xi_{av}$ for a CDM model.
The pair conservation equation is used to solve for $-u/Hx$ using the evolution
of $\xi_{av}(a, x)$. The three curves are for $a = 0.3, 0.6, 0.8$. They would coincide for a
scale-free spectrum. They seem to approach the stable clustering value $-u/Hx = 1$
for $\xi_{av} > 200$. Taken from [337].

Stable clustering asserts that at small scales, high-density regions decouple
from the Hubble expansion and their physical size is stable, i.e. it does not
change with time [171]. This implies that the relative motion of particles within
gravitationally bound structures should compensate on average the Hubble
expansion. Following this idea, general relations can be obtained for the behavior
of the two-point correlation function from the continuity equation alone.
Indeed, from Eq. (16) it follows that

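$$ \begin{aligned} \frac{\partial \xi}{\partial \tau} &= \frac{\partial}{\partial \tau} \langle (1+\delta(\mathbf{x}_1))(1+\delta(\mathbf{x}_2)) \rangle \\ &= \langle -\nabla_1\cdot[(1+\delta(\mathbf{x}_1))\mathbf{u}(\mathbf{x}_1)]\,(1+\delta(\mathbf{x}_2)) \rangle - \langle (1+\delta(\mathbf{x}_1))\,\nabla_2\cdot[(1+\delta(\mathbf{x}_2))\mathbf{u}(\mathbf{x}_2)] \rangle. \end{aligned} \tag{196} $$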

Pulling out the derivatives using statistical homogeneity, we arrive at the pair
conservation equation [171]

$$ \frac{\partial \xi}{\partial \tau} + \nabla_{12}\cdot[\mathbf{u}_{12}(1+\xi)] = 0, \tag{197} $$

where the pairwise velocity is defined as

$$ \mathbf{u}_{12} \equiv \frac{\langle (1+\delta(\mathbf{x}_1))(1+\delta(\mathbf{x}_2))(\mathbf{u}(\mathbf{x}_1) - \mathbf{u}(\mathbf{x}_2)) \rangle}{\langle (1+\delta(\mathbf{x}_1))(1+\delta(\mathbf{x}_2)) \rangle}. \tag{198} $$

In the non-linear regime, $\xi \gg 1$, stable clustering implies that the pairwise
velocity exactly cancels the Hubble flow, $\mathbf{u}_{12} = -\mathcal{H}\mathbf{x}_{12}$. Under this assumption,
Eq. (197) can be readily solved to yield


$$ \xi(x, \tau) \approx 1 + \xi(x, \tau) = a^3(\tau)\, f_2(ax), \tag{199} $$


which means that the probability of having a neighbor at a fixed physical 
separation, Eq. (128), becomes independent of time. Equation (197) can be 
rewritten as, 


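$$ \frac{u_{12}(x)}{\mathcal{H}x} = \frac{1}{3\,(1+\xi(x))}\,\frac{\partial \xi_{av}(x)}{\partial \ln a}, \tag{200} $$

which shows that the pairwise velocity is intimately related to the behavior of
the two-point function. Here we defined the average two-point function as

$$ \xi_{av}(x) \equiv \frac{3}{x^3}\int_0^x x'^2\, dx'\, \xi(x'), \tag{201} $$

and $u_{12}$ is the norm of $\mathbf{u}_{12}$, which can only point along the $\mathbf{x}_2 - \mathbf{x}_1$ direction.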
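
The link between the average correlation function and the pairwise velocity can be illustrated numerically. The sketch below assumes the volume average $\xi_{av}(x) = (3/x^3)\int_0^x \xi(y)\,y^2\,dy$ and a stable-clustering power law $\xi = a^{3-\gamma}(x/x_0)^{-\gamma}$, for which the right-hand side of Eq. (200) reduces to $\xi/(1+\xi)$. The function names, the reference scale `x0` and the slope used in the example are illustrative choices.

```python
def xi_av_numeric(xi, x, steps=20000):
    """Midpoint-rule estimate of xi_av(x) = (3/x^3) * int_0^x xi(y) y^2 dy."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += xi(y) * y * y
    return 3.0 * total * h / x ** 3


def pair_velocity_ratio(a, x, gamma, x0=1.0):
    """Scaled pairwise velocity u12/(H x) from Eq. (200) for a power-law xi."""
    xi = a ** (3.0 - gamma) * (x / x0) ** (-gamma)
    xi_av = 3.0 * xi / (3.0 - gamma)          # analytic average of a power law
    dxi_av_dlna = (3.0 - gamma) * xi_av       # xi_av scales as a^(3 - gamma)
    return dxi_av_dlna / (3.0 * (1.0 + xi))


if __name__ == "__main__":
    gamma = 1.8
    print("numeric xi_av / xi at x = 1:", xi_av_numeric(lambda y: y ** -gamma, 1.0))
    for x in (0.01, 1.0, 100.0):
        print(x, pair_velocity_ratio(1.0, x, gamma))
```

On small scales, where $\xi \gg 1$, the ratio approaches unity, which is the stable clustering value; on large scales, where $\xi \ll 1$, it tends to zero for this particular (non-linear) power law.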

From Eq. (200) it follows that if the time evolution is modeled as following
linear PT, then the rhs becomes $2f\xi_{av}/3$. As $\xi_{av} > 1$, $\xi_{av}$ grows faster than
linear theory and thus pairwise velocities overcompensate the Hubble flow; this
leads to the well-known “shoulder” (a sudden increase of slope) in the
two-point correlation function [271]. These regimes are illustrated in Fig. 18 (footnote 25).
From Eq. (200) it is also clear that a way to model the evolution of the
two-point correlation function is by modeling the dependence of pairwise velocities
on $\xi_{av}$ [289,479,358,213,112]. The analysis of high resolution N-body
simulations [358] run by the Virgo Consortium [342] shows that the slope of $\xi_2(r)$
indeed exhibits a “shoulder” in the form of an inflection point $d^2\xi_2(r)/dr^2 = 0$
at a separation $r_*$ close to the correlation length $r_0$, where $\xi_2(r_0) = 1$. This
property has been recently corroborated for different initial power-spectrum
shapes [260]. The approximate equality between $r_*$ and $r_0$ is related to the fact that loop
corrections become important close to the non-linear scale in CDM models at
$z = 0$, giving rise to a change in slope. For models where the spectral index
at the non-linear scale is very negative (such as CDM models at high redshift,
$z \sim 3$, see e.g. [702]), loop corrections can be very large (see Fig. 12), and
the non-linear scale $r_0$ can be much smaller than that where loop corrections
become important (related to $r_*$).

25 See [244] for a recent study of the time dependence of the pairwise velocity in the
non-linear regime due to merging.

A similar approach can be used to obtain the behavior of higher-order
correlation functions under additional stable clustering conditions [508,337]. The
starting point is again the continuity equation, Eq. (16), and for the
three-point case we have

$$ \frac{\partial h_{123}}{\partial \tau} = -\langle \nabla_1\cdot(A_{123}\mathbf{u}_1) + \nabla_2\cdot(A_{123}\mathbf{u}_2) + \nabla_3\cdot(A_{123}\mathbf{u}_3) \rangle, \tag{202} $$


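where $A_{123} \equiv (1+\delta(\mathbf{x}_1))(1+\delta(\mathbf{x}_2))(1+\delta(\mathbf{x}_3))$ and $h_{123} \equiv \langle A_{123} \rangle = 1 + \xi_{12} +
\xi_{23} + \xi_{31} + \xi_{123}$. Analogous calculations to the two-point case show that

$$ \frac{\partial h_{123}}{\partial \tau} + \nabla_{12}\cdot(\mathbf{w}_{12,3}\, h_{123}) + \nabla_{23}\cdot(\mathbf{w}_{23,1}\, h_{123}) = 0, \tag{203} $$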


where

$$ \mathbf{w}_{12,3} \equiv \frac{\langle A_{123}\,(\mathbf{u}_1 - \mathbf{u}_2) \rangle}{h_{123}}, \tag{204} $$


and similarly for $\mathbf{w}_{23,1}$. Note that these three-body weighted pairwise velocities
are actually three-point quantities [337], since a third object is involved,
so they are different from Eq. (198). However, in the same spirit as in the
two-point case, if we assume that stable clustering leads to $\mathbf{w}_{ij,k} = -\mathcal{H}\mathbf{x}_{ij}$
independently of the position of object $k$, it follows that the solution of Eq. (203)
is

$$ \xi_3(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) \approx h_{123} = a^6(\tau)\, f_3(a\mathbf{x}_1, a\mathbf{x}_2, a\mathbf{x}_3), \tag{205} $$

and thus the probability of having two neighbors at fixed physical separations
$a x_{12}$ and $a x_{23}$ from a given object at $\mathbf{x}_2$ becomes independent of time [e.g.
see Eqs. (128-129)]. Similar results hold for higher-order N-point correlation
functions $\xi_N$ [508], and imply that the ratios $\xi_N/\xi_2^{N-1}$ as a function of physical separation
become independent of time in the highly non-linear regime ($1 \ll \xi_2 \ll \cdots \ll \xi_N$).
Note however that the additional stability conditions such as $\mathbf{w}_{12,3} \approx
-\mathcal{H}\mathbf{x}_{12}$ have not so far been tested against numerical simulations.


4.5.3 Scale Invariance

The joint use of stable clustering arguments and the self-similar solution leads
to scale-invariant correlation functions in the non-linear regime, with precise
predictions for the power-law indices. Equations (194) and (199) impose that
$f_2(x)$ follows a power law in $x$,

$$ f_2(x) \sim x^{-\gamma}, \tag{206} $$

and matching the time dependences it follows that

$$ \gamma = \frac{6}{3\alpha + 2} = \frac{3(n+3)}{n+5}. \tag{207} $$


Thus, self-similarity plus stable clustering fixes the full time and spatial
dependence of the two-point correlation function in the non-linear regime in terms
of the initial conditions [171].

A simple generalization of this argument is to assume that in the non-linear
regime $\mathbf{u}_{12} = -h\mathcal{H}\mathbf{x}_{12}$, where $h$ is some constant, not necessarily unity. In this
case, Eq. (199) becomes $\xi(x, \tau) = a^{3h}(\tau)\, f_2(a^h x)$, and this leads to $\gamma = 3h(n+
3)/[2 + h(n+3)]$ [485,697]. Interestingly, if $h(n+3)$ is a constant independent of
spectral index $n$, then the slope of the two-point correlation function becomes
independent of initial conditions (footnote 26). Current scale-free simulations do not see
evidence for a spectral index dependence of the asymptotic value of pairwise
velocities and are in reasonable agreement with stable clustering [150,337,164],
although the dynamic range in the highly non-linear regime is still somewhat
limited. For a different point of view see [486].
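
The scaling indices above are simple enough to tabulate. The following sketch evaluates the self-similar index $\alpha = 4/[3(n+3)]$ and the non-linear slope $\gamma = 3h(n+3)/[2+h(n+3)]$, and checks the latter against the equivalent form $\gamma = 6h/(3\alpha + 2h)$, which follows from matching the time dependence of $a^{3h} f_2(a^h x)$ with $a \propto t^{2/3}$ to a function of $x/t^{\alpha}$. The function names are illustrative.

```python
def self_similar_alpha(n):
    """Self-similarity index alpha = 4 / [3 (n + 3)], valid for n > -3."""
    return 4.0 / (3.0 * (n + 3.0))


def nonlinear_slope(n, h=1.0):
    """Slope gamma = 3 h (n+3) / [2 + h (n+3)]; h = 1 is stable clustering."""
    return 3.0 * h * (n + 3.0) / (2.0 + h * (n + 3.0))


def slope_from_alpha(alpha, h=1.0):
    """Same slope written in terms of alpha: gamma = 6 h / (3 alpha + 2 h)."""
    return 6.0 * h / (3.0 * alpha + 2.0 * h)


if __name__ == "__main__":
    for n in (-2.0, -1.5, -1.0, 0.0):
        alpha = self_similar_alpha(n)
        print(n, alpha, nonlinear_slope(n), slope_from_alpha(alpha))
```

For stable clustering ($h = 1$) this gives $\gamma = 1$ for $n = -2$, $\gamma = 1.5$ for $n = -1$ and $\gamma = 1.8$ for $n = 0$.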

The behavior of the higher-order correlation functions can similarly be
constrained. Since stable clustering implies that $Q_N \sim \xi_N/\xi_2^{N-1}$ is independent
of time, adding self-similarity leads to $Q_N$ being independent of overall scale
as well; this leads to a scaling relation for higher-order correlations that can
be formulated in general as

$$ \xi_N(\lambda\mathbf{x}_1, \ldots, \lambda\mathbf{x}_N) = \lambda^{-(N-1)\gamma}\, \xi_N(\mathbf{x}_1, \ldots, \mathbf{x}_N), \tag{208} $$

where $\gamma$ is the index of the two-point function, Eq. (207). As a result,
self-similarity plus stable clustering does not fix completely the behavior of the
three-point and higher-order correlation functions. Although $Q_N$ does not
depend on the overall scale, it does in principle depend on the configuration of
the N points, i.e. it can depend on ratios such as $x_{12}/x_{23}$. This is the same as
in tree-level PT, where $Q_3$ depends on the triangle shape (Figs. 9 and 10).

We should at this point reconsider the results in this section from the point
of view of the dynamics of gravitational instability. The equations of motion
for the two- and three-point correlation functions, Eqs. (197) and (203), which
express conservation of pairs and triplets, were obtained from the equation of
continuity alone. These are rigorous results. The validity of self-similarity is
also rigorous for scale-free initial conditions in an $\Omega_m = 1$ universe. On the other
hand, the conditions of stable clustering are only a (physically motivated)
ansatz, and they replace what might be obtained by solving the remaining
piece of the dynamics, i.e. momentum conservation, in the highly non-linear
regime. Note however that the conditions of stable clustering can only be part
of the story for higher-order correlation functions, since these do not explain
why e.g. $Q_3$ tends to become constant independent of triangle configuration
in the non-linear regime.


26 A more detailed analysis of the BBGKY hierarchy shows that, in the absence
of self-similarity, power-law solutions for the two-point function in the non-linear
regime exist, but their relation to the initial spectral index depends on $h$, the scaling
of $\xi_3$ in terms of $\xi_2$ and the skewness of the velocity distribution. Furthermore,
perturbations away from self-similarity may not be stable [542,697,698].

4.5.4 The Non-Linear Evolution of Two-Point Statistics

Self-similarity gives a powerful constraint on the space and time evolution of
correlation functions, by requiring that these depend only on the self-similarity
variables. However, different initial spectra can lead to very different functions
of the self-similarity variables. Hamilton et al. [289] suggested a useful way of
thinking about the non-linear evolution of the two-point correlation function,
by which the evolution from different initial spectra can all be described by
the same (approximately) universal formula, obtained empirically by fitting
to numerical simulations.
The starting point is conservation of pairs, Eq. (197), which implies

$$ \frac{\partial [x^3(1+\xi_{av})]}{\partial \tau} + u_{12}\,\frac{\partial [x^3(1+\xi_{av})]}{\partial x} = 0. \tag{209} $$

Thus, a sphere of radius $x$ such that $x^3(1+\xi_{av}) \equiv x_L^3$ is independent of time
will contain the same number of neighbors throughout non-linear evolution. At
early times, when fluctuations are small, $x_L \approx x$; as clustering develops and
becomes non-linear, $x$ becomes smaller than $x_L$. This motivated the ansatz
that the non-linear average two-point correlation function at scale $x$ should
be a function of the linear one at scale $x_L$ [289]

$$ \xi_{av}(x, \tau) = F_{\rm map}[\xi_{av,L}(x_L, \tau)], \tag{210} $$

where the mapping $F_{\rm map}$ was assumed to be universal, i.e. independent of
initial conditions. Using more recent numerical simulations, [335] showed that
there is a dependence of $F_{\rm map}$ on spectral index (particularly as $n < -1$);
in addition [493] extended the mapping above to the power spectrum and
arbitrary $\Omega_m$ and $\Omega_\Lambda$. In this case, the non-linear power spectrum at scale $k$ is
assumed to be a function of the linear power spectrum at scale $k_L$, such that
$k = [1 + \Delta(k)]^{1/3} k_L$, where $\Delta(k) \equiv 4\pi k^3 P(k)$,

$$ \Delta(k, \tau) = F_{n,\Omega_m,\Omega_\Lambda}[\Delta_L(k_L, \tau)], \tag{211} $$


where it is emphasized that the mapping depends on spectral index and
cosmological parameters. Several groups have reported improved fitting formulae
that take into account these extra dependences [335,30,494]. In the most often
used version, the fitting function $F_{\rm map}$ contains 5 free functions of the
spectral index $n$ which interpolate between $F_{\rm map}(x) \approx x$ in the linear regime and
$F_{\rm map} \approx x^{3/2}$ in the non-linear regime where stable clustering is assumed to
hold [494]

$$ F_{\rm map}(x) = x \left[ \frac{1 + B\beta x + [Ax]^{\alpha\beta}}{1 + [(Ax)^{\alpha}\, g^3(\Omega)/(V x^{1/2})]^{\beta}} \right]^{1/\beta}, \tag{212} $$


where $A = 0.482\,(1+n/3)^{-0.947}$, $B = 0.226\,(1+n/3)^{-1.778}$, $\alpha = 3.310\,(1+
n/3)^{-0.244}$, $\beta = 0.862\,(1+n/3)^{-0.287}$, $V = 11.55\,(1+n/3)^{-0.423}$, and the linear
growth factor has been written as $D_1 = a\, g(\Omega)$ with $g(\Omega) = \frac{5}{2}\Omega_m/[\Omega_m^{4/7} - \Omega_\Lambda +
(1+\Omega_m/2)(1+\Omega_\Lambda/70)]$ [114]. For models which are not scale free, such as CDM
models, the spectral index is taken as $n(k_L) = [d\ln P/d\ln k](k = k_L/2)$ [494].
Extensions of this approach to models with massive neutrinos are considered
in [417]; for a description of the non-linear evolution of the bispectrum along
these lines see [568].
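
A minimal sketch of how the fitting formula can be evaluated is given below. It assumes the form $F_{\rm map}(x) = x\,\{(1 + B\beta x + [Ax]^{\alpha\beta})/(1 + [(Ax)^{\alpha} g^3/(V x^{1/2})]^{\beta})\}^{1/\beta}$ with the coefficients quoted above, a fixed spectral index $n$ supplied by the caller, and the scale mapping $k = [1+\Delta(k)]^{1/3} k_L$. The function names and default arguments are illustrative and do not correspond to any published code.

```python
import math


def growth_suppression(omega_m, omega_l):
    """g(Omega) of [114]; equals 1 for Omega_m = 1, Omega_Lambda = 0."""
    return 2.5 * omega_m / (omega_m ** (4.0 / 7.0) - omega_l
                            + (1.0 + omega_m / 2.0) * (1.0 + omega_l / 70.0))


def f_map(x, n, omega_m=1.0, omega_l=0.0):
    """Map a linear Delta_L(k_L) = x to the non-linear Delta(k), Eq. (212)."""
    s = 1.0 + n / 3.0
    A = 0.482 * s ** -0.947
    B = 0.226 * s ** -1.778
    alpha = 3.310 * s ** -0.244
    beta = 0.862 * s ** -0.287
    V = 11.55 * s ** -0.423
    g = growth_suppression(omega_m, omega_l)
    num = 1.0 + B * beta * x + (A * x) ** (alpha * beta)
    den = 1.0 + ((A * x) ** alpha * g ** 3 / (V * math.sqrt(x))) ** beta
    return x * (num / den) ** (1.0 / beta)


def nonlinear_point(k_lin, delta_lin, n, omega_m=1.0, omega_l=0.0):
    """Return (k, Delta(k)) with k = [1 + Delta(k)]^(1/3) k_L."""
    delta_nl = f_map(delta_lin, n, omega_m, omega_l)
    return (1.0 + delta_nl) ** (1.0 / 3.0) * k_lin, delta_nl


if __name__ == "__main__":
    # Scale-free example: Delta_L(k_L) = (k_L / k_nl)^(n+3) with k_nl = 1.
    n = -1.5
    for k_lin in (0.1, 0.5, 1.0, 2.0):
        print(nonlinear_point(k_lin, k_lin ** (n + 3.0), n))
```

In the limit $x \to 0$ the function returns $x$ (linear theory), while for $x \to \infty$ it approaches $V x^{3/2}/g^3$, the stable clustering scaling.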

The ansatz that the non-linear power spectrum at a given scale is a func¬ 
tion of the linear power at larger scales is a reasonable first guess, but this 
cannot be expected to hold in detail. First, as we described in Section 4.2.2, 
mode-coupling leads to a transfer of power from large to small scales (in CDM 
spectra with decreasing spectral index as a function of scale) and the result¬ 
ing small-scale power has a contribution from a range of scales in the linear 
power spectrum. In addition, the mapping above is only based on the pairs 
conservation equation, and thus only takes into account mass conservation. 
The conditions of validity of the HKLM mapping have been explored in [479], 
where it is shown that if the scaled pairwise velocity U 12 /(TLxu) is only a 
function of the average correlation function, U 12 /{fhixi 2 ) = i7(£ av ), then con¬ 
servation of pairs implies 


Lvl(x l ) = exp 


r2 

-3 


£av(#) 


ds 


H (s)(l + ») 


(213) 


where $x_L$ and $x$ are related as in the HKLM mapping. In linear PT, $H =
2\xi_{\rm av}/3$, and if stable clustering holds $H = 1$. In general however $H$ cannot
be strictly a function of $\xi_{\rm av}$ alone (e.g. due to mode-coupling in the weakly
non-linear regime). A recent numerical model for the evolution of the pairwise
velocity is given in [112], which is used to model the non-linear evolution of
the average correlation function.
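As a quick consistency check of Eq. (213), the two limiting forms of $H$ can be integrated in closed form. This is only a sketch: the lower limit of the integral is not specified above, so we absorb it in a constant $C$ (our assumption), and the results hold up to that normalization.

$$H(s) = \frac{2s}{3}: \quad \frac{2}{3}\int^{\xi_{\rm av}} \frac{ds}{(2s/3)(1+s)} = \ln\frac{\xi_{\rm av}}{1+\xi_{\rm av}} + C \;\Rightarrow\; \xi_{\rm av}^{L} \propto \frac{\xi_{\rm av}}{1+\xi_{\rm av}} \approx \xi_{\rm av} \quad (\xi_{\rm av} \ll 1),$$

$$H(s) = 1: \quad \frac{2}{3}\int^{\xi_{\rm av}} \frac{ds}{1+s} = \frac{2}{3}\ln(1+\xi_{\rm av}) + C \;\Rightarrow\; \xi_{\rm av} \propto \left(\xi_{\rm av}^{L}\right)^{3/2} \quad (\xi_{\rm av} \gg 1).$$

The first line recovers the linear regime and the second the $3/2$ power law associated with stable clustering.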


4.5.5 The Hierarchical Models

The absence of solutions of the equations of motion in the non-linear regime
has motivated the search for consistent relations between correlation functions
inspired by observations of galaxy clustering and the symmetries of dynamics,
i.e. the self-similar solution. The most common example is the so-called
hierarchical model for the connected $N$-point correlation function [275,231] which
naturally obeys the scaling law (208),

$$\xi_N(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \sum_{a=1}^{t_N} Q_{N,a} \sum_{\rm labelings}\; \prod_{\rm edges}^{N-1} \xi_{AB}. \tag{214}$$

The product is over $N-1$ edges that link $N$ objects (vertices) $A, B, \ldots$, with a
two-point correlation function $\xi_{XY}$ assigned to each edge. These configurations
can be associated with ‘tree’ graphs, called $N$-trees. Topologically distinct
$N$-trees, denoted by $a$, in general have different amplitudes, denoted by $Q_{N,a}$, but
those configurations which differ only by permutations of the labels
(and therefore correspond to the same topology) have the same amplitude.
There are $t_N$ distinct $N$-trees ($t_3 = 1$, $t_4 = 2$, etc., see [232,85]) and a total of
$N^{N-2}$ labeled trees.

In summary, the hierarchical model represents the connected $N$-point functions
as sums of products of $(N-1)$ two-point functions, introducing at each
level only as many extra parameters $Q_{N,a}$ as there are distinct topologies. In a
degenerate hierarchical model, the amplitudes $Q_{N,a}$ are furthermore independent
of scale and configuration. In this case, $Q_{N,a} = Q_N$, and the hierarchical
amplitudes $S_N \simeq N^{N-2} Q_N$. In the general case, it can be expected that
the amplitudes $Q_N$ depend on overall scale and configuration. For example,
for Gaussian initial conditions, in the weakly non-linear regime, $\sigma^2 \ll 1$,
perturbation theory predicts a clustering pattern that is hierarchical but not
degenerate.

It is important to note that if the degenerate hierarchical model holds in the nonlinear
regime, the $Q_N$'s should obey positivity constraints. By requiring that the
fluctuations of the number density of neighbors should be positive, it follows
that [508]

$$Q_3 \geq \frac{1}{3}. \tag{215}$$

This constraint was later generalized through Schwarz inequalities in [231] to
get,

$$(2M)^{2M-2}\, Q_{2M}\; (2N)^{2N-2}\, Q_{2N} \geq \left[(M+N)^{M+N-2}\, Q_{M+N}\right]^2, \tag{216}$$


where $M$ and $N$ are integers or odd half-integers. Similar constraints$^{27}$ have
been derived in [57],

$$(N+2)^{N}\, Q_{N+2}\; N^{N-2}\, Q_{N} \geq \left[(N+1)^{N-1}\, Q_{N+1}\right]^2. \tag{218}$$


There is no proof, not even indications, that any model fulfilling these con¬ 
straints is mathematically valid. This is a serious limitation for building such 
models. 
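To make the content of Eq. (216) concrete, two low-order cases can be written out. This is a sketch that assumes the conventional normalization $Q_1 = Q_2 = 1$, which is not stated explicitly above.

$$M = \tfrac{1}{2},\; N = \tfrac{3}{2}: \quad 1^{-1} Q_1 \cdot 3^{1} Q_3 \geq \left[2^{0} Q_2\right]^2 \;\Rightarrow\; Q_3 \geq \frac{1}{3},$$

$$M = 1,\; N = 2: \quad 2^{0} Q_2 \cdot 4^{2} Q_4 \geq \left[3^{1} Q_3\right]^2 \;\Rightarrow\; Q_4 \geq \frac{9}{16}\, Q_3^2.$$

The first case reproduces Eq. (215); the second bounds the four-point amplitude from below in terms of the three-point one.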

$^{27}$ A more physically motivated constraint can be derived by imposing that cluster
points be more correlated than field points [287,288]. It leads to

$$Q_P \geq \frac{1}{2}\left(\frac{P-1}{P}\right)^{P-3} Q_{P-1} \geq \ldots \geq \frac{P!}{2^{P-1}P^{P-2}}, \tag{217}$$

which appears more stringent than the constraints above. These constraints are
saturated in the model of Eq. (220) with $Q_3 = 1/2$.

Using the BBGKY hierarchy obtained from the Vlasov equation and assuming
a hierarchical form similar to Eq. (214) for the phase-space $N$-point distribution
function in the stable clustering limit, Fry [228,231] obtained ($N \geq 3$)

$$Q_N = Q_{N,a} = \frac{1}{2}\left(\frac{N}{N-1}\right)\left(\frac{4Q_3}{N}\right)^{N-2}; \tag{219}$$


in this case, different tree diagrams all have the same amplitude, i.e., the
clustering pattern is degenerate. On the other hand, Hamilton [286], correcting
an unjustified symmetry assumption in [228,231], instead found

$$Q_{N,{\rm snake}} = Q_3^{N-2}, \qquad Q_{N,{\rm star}} = 0, \tag{220}$$

where “star” graphs correspond to those tree graphs in which one vertex is
connected to the other $(N-1)$ vertices, the rest being “snake” graphs (if
$Q_3 = 1/2$ this corresponds to the Rayleigh-Levy random walk fractal described
in [508]). Summed over the snake graphs, (220) yields

$$Q_N = \frac{N!}{2}\left(\frac{Q_3}{N}\right)^{N-2}. \tag{221}$$
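The counting behind Eq. (221) can be checked by brute force. The sketch below enumerates labeled trees through their Prüfer sequences and sorts them by shape. The function names are ours, and we assume that only path-like (“snake”) trees carry the amplitude $Q_3^{N-2}$ while every other topology contributes zero; for $N \geq 5$ there are trees that are neither stars nor snakes.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def classify_labeled_trees(n):
    """Count labeled trees on n vertices by shape, using Pruefer sequences.

    A vertex appearing m times in the Pruefer sequence has degree m + 1,
    so the shape can be read off without building the tree explicitly.
    """
    counts = {"snake": 0, "star": 0, "other": 0}
    for seq in product(range(n), repeat=n - 2):
        max_degree = 1 + max(seq.count(v) for v in range(n))
        if max_degree <= 2:
            counts["snake"] += 1      # a path: no vertex has more than 2 links
        elif max_degree == n - 1:
            counts["star"] += 1       # one vertex linked to all the others
        else:
            counts["other"] += 1
    return counts


def q_n_from_snakes(n, q3):
    """Average amplitude over all n**(n-2) labeled trees, snakes only."""
    snakes = classify_labeled_trees(n)["snake"]
    return Fraction(snakes) * q3 ** (n - 2) / n ** (n - 2)
```

Under these assumptions the number of snakes is $N!/2$ out of $N^{N-2}$ labeled trees, so the average over labelings reduces to the closed form of Eq. (221).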


Unfortunately, as emphasized in [286], these results are not physically meaningful
solutions to the BBGKY hierarchy, but rather a direct consequence of
the assumed factorization in phase-space. As a result, this approach leads to
unphysical predictions such as that cluster-cluster correlations are equal to
galaxy-galaxy correlations to all orders. It remains to be seen whether physically
relevant solutions to the BBGKY hierarchy which satisfy Eq. (214) really
do exist. Despite these shortcomings, the results in Eq. (219) and Eq. (220) are
often quoted in the literature as physically relevant solutions to the BBGKY
hierarchy!

Another phenomenological assumption on the parameters $Q_{N,a}$, which has the
virtue of being closer to the mathematical structure found in PT, is provided
by the tree hierarchical model [41,473,57]. In this case the parameters $Q_{N,a}$
are obtained by the product of weights $\nu_i$ associated to each of the vertices
appearing in the tree structure,

$$Q_{N,a} = \prod_i \nu_i^{d_i(a)}. \tag{222}$$

In this expression the product is made over all vertices appearing in configuration
$a$, $\nu_i$ is the weight of the vertex connected to $i$ lines and $d_i(a)$ is the number
of such vertices. The parameter $Q_{N,a}$ is therefore completely specified by the
star diagram amplitudes. This pattern is analogous to what emerges from PT
at large scales, although the parameters $Q_{N,a}$ are here usually taken to be constant,
independent of scale and configuration. But even in the absence of this
latter hypothesis the genuine tree structure$^{28}$ of the tree hierarchical model
turned out to be very useful for phenomenological investigations (see [57] and
Sect. 7.1).

$^{28}$ In the sense that any part of the diagram can be computed irrespectively of the
global configuration.

Fig. 19. HEPT compared to N-body simulations for scale-free initial conditions (left)
and CDM (right).

4.5.6 Hyperextended Perturbation Theory

More direct connections with PT results have been proposed to build models
of non-linear clustering. One is known as the “hyperextended perturbation
theory” (HEPT, [563])$^{29}$. Its construction is based on the observation that
colinear configurations play a special role in gravitational clustering, which
becomes apparent in the discussion on the bispectrum loop corrections (see
Sect. 4.2.3). They correspond to matter flowing parallel to density gradients,
thus enhancing clustering at small scales until eventually giving rise to bound
objects that support themselves by velocity dispersion (virialization). HEPT
conjectures that the “effective” $Q_N$ clustering amplitudes in the strongly
non-linear regime are the same as the weakly non-linear (tree-level PT) colinear
amplitudes, as shown in Fig. 16 to hold well for three-point correlations.

Note that by effective amplitudes $Q_N^{\rm eff}$ the overall magnitude of $Q_N$ is understood:
it is possible that $Q_N$, for $N > 3$, although independent of overall scale,
is a function of configuration. To calculate the resulting $S_N$ parameters, it is
further assumed that $S_N \simeq N^{N-2} Q_N^{\rm eff}$, that is, the $S_N$ are given by the typical
configuration amplitude $Q_N^{\rm eff}$ times the total number of labeled trees, $N^{N-2}$,
neglecting a small correction due to smoothing [85]. The resulting non-linear
$S_N$ amplitudes follow from tree-level PT [563]

$$S_3^{\rm sat}(n) = 3\,Q_3^{\rm sat}(n) = 3\,\frac{4 - 2^n}{1 + 2^{n+1}}, \tag{223}$$

$$S_4^{\rm sat}(n) = 16\,Q_4^{\rm sat}(n) = 8\,\frac{54 - 27\cdot 2^n + 2\cdot 3^n + 6^n}{1 + 6\cdot 2^n + 3\cdot 3^n + 6\cdot 6^n}, \tag{224}$$

$$S_5^{\rm sat}(n) = 125\,Q_5^{\rm sat}(n) = \frac{125}{6}\,\frac{N(n)}{D(n)}, \tag{225}$$

where $n$ is the spectral index, obtained from $(n+3) \equiv -d\ln\sigma_L^2(R)/d\ln R$,
$N = 1536 - 1152\cdot 2^n + 128\cdot 3^n + 66\cdot 4^n + 64\cdot 6^n - 9\cdot 8^n - 2\cdot 12^n - 24^n$,
$D = 1 + 12\cdot 2^n + 12\cdot 3^n + 16\cdot 4^n + 24\cdot 6^n + 24\cdot 8^n + 12\cdot 12^n + 24\cdot 24^n$.
One can check that these $Q_N$ amplitudes satisfy the above positivity constraints,
Eqs. (216,218) and even the constraint in Eq. (217) as long as $n < 0.75$, which is
well within the physically interesting range.

$^{29}$ A more phenomenological model, EPT (Extended Perturbation Theory), is presented
in Sect. 5.13.

The left panel of Fig. 19 shows a comparison of these predictions with the
numerical simulation measurements in [150] for scale-free initial conditions
with $\Omega_m = 1$. The plotted values correspond to the measured value of $S_p$ when
the non-linear variance $\sigma^2 = 100$. We see that the N-body results are generally
in good agreement with the predictions of HEPT, Eqs. (223), (224) and (225),
keeping in mind that for $n = -2$ finite-volume corrections to the $S_p$ measured
in the simulations are quite large and thus uncertain (see Sect. 6.12.1). The
right panel shows a similar comparison of HEPT with numerical simulations
in the non-linear regime for the SCDM model ($\Gamma = 0.5$, $\sigma_8 = 0.34$, [147]). The
agreement between the N-body results and the HEPT predictions is excellent
in this case. The small change in predicted value of $S_p$ with scale is due to the
scale-dependence of the linear CDM spectral index.

It is interesting to note that for $n = 0$, HEPT predicts $S_p = (2p-3)!!$, which
agrees exactly with the excursion set model developed in [588] for white-noise
Gaussian initial fluctuations. In this case, the one-point PDF yields an inverse
Gaussian distribution, which has been shown to agree well in the non-linear
regime when compared to numerical simulations [588]. This remarkable agreement
between HEPT and the excursion set model deserves further study.
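The HEPT amplitudes are simple closed-form functions of the spectral index and can be tabulated directly. The sketch below transcribes Eqs. (223), (224) and (225); the function names are ours, and the overall prefactor of Eq. (224) is fixed by requiring $S_4 = 15$ at $n = 0$, in line with the $(2p-3)!!$ property quoted above.

```python
def hept_s3(n):
    """Eq. (223): S_3 = 3 Q_3 in the strongly non-linear regime."""
    return 3.0 * (4.0 - 2.0 ** n) / (1.0 + 2.0 ** (n + 1))


def hept_s4(n):
    """Eq. (224): S_4 = 16 Q_4."""
    num = 54.0 - 27.0 * 2.0 ** n + 2.0 * 3.0 ** n + 6.0 ** n
    den = 1.0 + 6.0 * 2.0 ** n + 3.0 * 3.0 ** n + 6.0 * 6.0 ** n
    return 8.0 * num / den


def hept_s5(n):
    """Eq. (225): S_5 = 125 Q_5 = (125/6) N(n)/D(n)."""
    num = (1536.0 - 1152.0 * 2.0 ** n + 128.0 * 3.0 ** n + 66.0 * 4.0 ** n
           + 64.0 * 6.0 ** n - 9.0 * 8.0 ** n - 2.0 * 12.0 ** n - 24.0 ** n)
    den = (1.0 + 12.0 * 2.0 ** n + 12.0 * 3.0 ** n + 16.0 * 4.0 ** n
           + 24.0 * 6.0 ** n + 24.0 * 8.0 ** n + 12.0 * 12.0 ** n
           + 24.0 * 24.0 ** n)
    return 125.0 / 6.0 * num / den


def double_factorial(k):
    """k!! for odd positive k."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result
```

For white noise, `hept_s3(0)`, `hept_s4(0)` and `hept_s5(0)` reduce to `double_factorial(2 * p - 3)` for `p = 3, 4, 5`.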
5 From Dynamics to Statistics: The Local Cosmic Fields

We have seen in Section 4 that the non-linear nature of gravitational dynamics 
leads, through mode coupling effects, to the emergence of non-Gaussianity. In 
the previous section we have explored the behavior of multi-point correlation 
functions. Here we present statistical properties related to the local density 
contrast in real space. We first describe the results that have been obtained for 
the moments of the local density field. In particular we show how to compute 
the full cumulant generating function of the one-point density contrast at 
tree level. Results including loop corrections are given when known. Finally, 
we present techniques for the computation of the density PDF and various 
applications of these results. When dealing with smoothed fields, we shall 
assume that filtering is done with a top-hat window unless specified otherwise. 

5.1 The Density Field Third Moment: Skewness 

5.1.1 The Unsmoothed Case 

The first non-trivial moment that emerges due to mode coupling is the third
moment of the local density probability distribution function, characterized
by the skewness parameter. The computation of the leading order term of
$\langle\delta^3\rangle$ is obtained through the expansion $\langle\delta^3\rangle = \langle(\delta^{(1)} + \delta^{(2)} + \ldots)^3\rangle$. When the
terms that appear in this formula are organized in increasing powers of the
local linear density, we have $\langle\delta^3\rangle = \langle(\delta^{(1)})^3\rangle + 3\langle(\delta^{(1)})^2\delta^{(2)}\rangle + \ldots$, where the
neglected terms are of higher-order in PT. The first term of this expansion is
identically zero for Gaussian initial conditions. The second term is therefore
the leading order, “tree-level” in diagrammatic language (see Section 4.1). We
then have$^{30}$

$$\langle\delta^3\rangle \approx 3\langle(\delta^{(1)})^2\delta^{(2)}\rangle \tag{226}$$

$$= 3a^4 \int d^3k_1 \ldots \int d^3k_4\, \langle\delta_1(\mathbf{k}_1)\,\delta_1(\mathbf{k}_2)\,\delta_1(\mathbf{k}_3)\,\delta_1(\mathbf{k}_4)\rangle \times F_2(\mathbf{k}_2,\mathbf{k}_3)\, \exp[i(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3+\mathbf{k}_4)\cdot\mathbf{x}]. \tag{227}$$

For Gaussian initial conditions, linear Fourier modes $\delta_1(\mathbf{k})$ can only correlate
in pairs [Eq. (122)]. If $\mathbf{k}_2$ and $\mathbf{k}_3$ are paired, the integral vanishes [because
$\langle\delta\rangle = 0$, see the structure of the kernel $F_2$ in Eq. (45)]. The other two pairings
give identical contributions, and thus

$$\langle\delta^3\rangle = 6a^4 \int d^3k_1 \int d^3k_2\, P(k_1)\, P(k_2)\, F_2(\mathbf{k}_1,\mathbf{k}_2). \tag{228}$$

Integrating over the angle between $\mathbf{k}_1$ and $\mathbf{k}_2$ leads to $\langle\delta^3\rangle = (34/7)\langle\delta^2\rangle^2$ [508].
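The angular integration that produces the factor 34/7 can be reproduced numerically. The sketch below assumes that the kernel $F_2$ of Eq. (45) is the standard Einstein-de Sitter second-order kernel, $F_2 = 5/7 + \frac{\mu}{2}(k_1/k_2 + k_2/k_1) + \frac{2}{7}\mu^2$ with $\mu$ the cosine of the angle between the two wave vectors; the function names are ours.

```python
def f2_kernel(k1, k2, mu):
    """Standard Einstein-de Sitter second-order density kernel (assumed form)."""
    return 5.0 / 7.0 + 0.5 * mu * (k1 / k2 + k2 / k1) + (2.0 / 7.0) * mu * mu


def angular_average(func, steps=20000):
    """Average of func(mu) over directions: (1/2) * integral over mu in [-1, 1].

    Uses a plain midpoint rule, which is ample for a quadratic in mu.
    """
    width = 2.0 / steps
    total = sum(func(-1.0 + (i + 0.5) * width) for i in range(steps))
    return 0.5 * total * width


def unsmoothed_skewness(k1=1.0, k2=2.5):
    """6 times the angular average of F_2; independent of k1 and k2."""
    return 6.0 * angular_average(lambda mu: f2_kernel(k1, k2, mu))
```

The dipole term averages to zero and the quadratic term to $2/21$, so the average of $F_2$ is $17/21$ whatever the moduli of the wave vectors, which is why the power spectrum drops out of the unsmoothed $S_3$.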

$^{30}$ For simplicity, calculations in this section are done for the Einstein-de Sitter case,
$\Omega_m = 1$.

For the reasons discussed in Sect. 4.1.1, it is convenient to rescale the third
moment and define the skewness parameter $S_3$ (see Sect. 2),

$$S_3 \equiv \frac{\langle\delta^3\rangle}{\langle\delta^2\rangle^2} = \frac{34}{7} + O(\sigma^2). \tag{229}$$

The skewness measures the tendency of gravitational clustering to create an
asymmetry between underdense and overdense regions (see Fig. 20). Indeed,
as clustering proceeds there is an increased probability of having large values
of $\delta$ (compared to a Gaussian distribution), leading to an enhancement of the
high-density tail of the PDF. In addition, as underdense regions expand and
most of the volume becomes underdense, the maximum of the PDF shifts to
negative values of $\delta$. From Eq. (144) we see that the maximum of the PDF is
in fact reached at

$$\delta_{\rm max} \approx -\frac{S_3}{2}\,\sigma^2, \tag{230}$$

to first order in $\sigma$. We thus see that the skewness factor $S_3$ contains very useful
information on the shape of the PDF.

5.1.2 The Smoothed Case 

At this stage however the calculation in Eq. (229) is somewhat academic because
it applies to the statistical properties of the local, unfiltered, density
field. In practice the fields are always observed at a finite spatial resolution
(whether it is in an observational context or in numerical simulations). The
effect of filtering, which amounts to convolving the density field with some
window function, should be taken into account in the computation of $S_3$. The
main difficulty lies in the complexity this brings into the computation of the
angular integral. To obtain the skewness of the local filtered density, $\delta_R$, one
indeed needs to calculate,

$$\langle\delta_R^3\rangle = 3\langle(\delta_R^{(1)})^2\,\delta_R^{(2)}\rangle \tag{231}$$

with

$$\delta_R^{(1)} = a\int d^3k\; \delta(\mathbf{k})\, \exp[i\mathbf{k}\cdot\mathbf{x}]\; W_3(kR), \tag{232}$$

$$\delta_R^{(2)} = a^2\int d^3k_1\int d^3k_2\; \delta(\mathbf{k}_1)\,\delta(\mathbf{k}_2)\, \exp[i(\mathbf{k}_1+\mathbf{k}_2)\cdot\mathbf{x}] \times F_2(\mathbf{k}_1,\mathbf{k}_2)\; W_3(|\mathbf{k}_1+\mathbf{k}_2|R), \tag{233}$$

where $W_3(k)$ is the 3D filtering function in Fourier space. It leads to the
expression for the third moment,

$$\langle\delta_R^3\rangle = 6a^4\int d^3k_1\int d^3k_2\; P(k_1)\,P(k_2)\; W_3(k_1R)\, W_3(k_2R) \times F_2(\mathbf{k}_1,\mathbf{k}_2)\; W_3(|\mathbf{k}_1+\mathbf{k}_2|R), \tag{234}$$

so that the relative angle between $\mathbf{k}_1$ and $\mathbf{k}_2$ appears in both $F_2$ and $W_3$. The
result depends obviously on the filtering procedure. It turns out that the final
result is simple for a top-hat filter in real space. In this case,

$$W_3(k) = 3\sqrt{\frac{\pi}{2}}\;\frac{J_{3/2}(k)}{k^{3/2}} = \frac{3}{k^3}\,[\sin(k) - k\cos(k)]. \tag{235}$$

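Following the investigations initiated in [355] for the properties of the top-hat
window function$^{31}$ it can be shown (see [46] and Appendix C),

$$\int\frac{d\Omega_{12}}{4\pi}\; W_3(|\mathbf{k}_1+\mathbf{k}_2|)\left[1 - \frac{(\mathbf{k}_1\cdot\mathbf{k}_2)^2}{k_1^2\,k_2^2}\right] = \frac{2}{3}\, W_3(k_1)\, W_3(k_2), \tag{236}$$

$$\int\frac{d\Omega_{12}}{4\pi}\; W_3(|\mathbf{k}_1+\mathbf{k}_2|)\left[1 + \frac{\mathbf{k}_1\cdot\mathbf{k}_2}{k_1^2}\right] = W_3(k_1)\left[W_3(k_2) + \frac{1}{3}\,k_2\, W_3'(k_2)\right]. \tag{237}$$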
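The first of these commutation properties is easy to test numerically. The sketch below evaluates the left-hand side of Eq. (236) by a midpoint rule over the cosine of the angle between the two wave vectors and compares it with the factorized right-hand side. The function names are ours, and the small-argument branch of `tophat` is a series expansion added only to avoid round-off.

```python
import math


def tophat(k):
    """Fourier transform of the real-space top-hat window, Eq. (235)."""
    if k < 1e-3:
        return 1.0 - k * k / 10.0  # leading terms of the Taylor expansion
    return 3.0 * (math.sin(k) - k * math.cos(k)) / k ** 3


def lhs_236(k1, k2, steps=4000):
    """Angular average of W(|k1 + k2|) * (1 - mu**2) over directions."""
    width = 2.0 / steps
    total = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * width
        k_sum = math.sqrt(k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * mu)
        total += tophat(k_sum) * (1.0 - mu * mu)
    return 0.5 * total * width  # d(Omega)/(4 pi) reduces to d(mu)/2


def rhs_236(k1, k2):
    """Factorized form: (2/3) W(k1) W(k2)."""
    return 2.0 / 3.0 * tophat(k1) * tophat(k2)
```

Agreement between `lhs_236` and `rhs_236` for arbitrary arguments illustrates why the top-hat filter leads to closed-form results: the angular weight $1 - \mu^2$ projects the window of the summed wave vector onto a product of single-mode windows.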


It is easy to see that $F_2$ can be expressed with the help of the two polynomials
involved in the preceding relations. One finally obtains [46],

$$S_3 = \frac{34}{7} + \frac{d\log\sigma^2(R)}{d\log R}. \tag{238}$$

The skewness thus depends on the power spectrum shape (mainly at the filtering
scale). For a power-law spectrum, $P(k) \propto k^n$, it follows that $S_3 =
34/7 - (n+3)$ [355]. Galaxy surveys indicate that the spectral index $n$ is of
the order of $n \approx -1.5$ close to the non-linear scale. Comparisons with numerical
simulations have shown that the prediction of Eq. (238) is very accurate,
as can be seen in Fig. 27.


5.1.3 Physical Interpretation of Smoothing 

To understand the dependence of the skewness parameter on the power spectrum
shape it is very instructive to examine in detail the nature of the contributions
that appear when the filtering effects are taken into account.

For this purpose let us consider the same problem in Lagrangian space. If one
calculates $J^{(2)}$, the second-order expansion of the Jacobian, one obtains [from
Eqs. (90,94) and assuming $\Omega_m = 1$],

$$J^{(2)} = a^2\int d^3k_1\int d^3k_2\; \delta(\mathbf{k}_1)\,\delta(\mathbf{k}_2)\, \exp[i(\mathbf{k}_1+\mathbf{k}_2)\cdot\mathbf{q}]\; \frac{2}{7}\left[1 - \frac{(\mathbf{k}_1\cdot\mathbf{k}_2)^2}{k_1^2\,k_2^2}\right]. \tag{239}$$

$^{31}$ These properties have been obtained from the summation theorem of Bessel functions,
see e.g. [681]. Such relations hold in any space dimension for top-hat filters.

Fig. 20. Skewness is a measure of the asymmetry of the local density distribution 
function. It appears because underdense regions evolve less rapidly than overdense 
regions as soon as nonlinearities start to play a role. The dependence of skewness 
with the shape of the power spectrum comes from a mapping between Lagrangian 
space, in which the initial size of the perturbation is determined, and Eulerian space. 
For a given filtering scale R, overdense regions come from the collapse of regions that 
had initially a larger size, whereas underdense regions come from initially smaller 
regions. As a result, the skewness is expected to be smaller for power spectra with 
more small scale fluctuations (steep spectra case, that is when $k^3P(k)$ is rapidly
increasing with $k$).

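This gives for the density [e.g. Eq. (91)], once the Jacobian (which is a direct
estimation of the volume) has been filtered at a given Lagrangian scale $R$,

$$\delta_R^{(2)} = \int d^3k_1\int d^3k_2\; a^2\,\delta(\mathbf{k}_1)\,\delta(\mathbf{k}_2)\, \exp[i(\mathbf{k}_1+\mathbf{k}_2)\cdot\mathbf{q}] \times \left\{ W(k_1R)\,W(k_2R) - \frac{2}{7}\,W(|\mathbf{k}_1+\mathbf{k}_2|R)\left[1 - \frac{(\mathbf{k}_1\cdot\mathbf{k}_2)^2}{k_1^2\,k_2^2}\right]\right\}. \tag{240}$$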
Fig. 21. The skewness $S_3$ as a function of $\Omega_m$ for zero-$\Omega_\Lambda$ Universes (solid lines)
and flat universes with $\Omega_m + \Omega_\Lambda = 1$ (dashed lines). The upper and lower curves
correspond to a power law spectrum with $n = -3$ and $n = -1$, respectively.


Because smoothing effects are calculated in Lagrangian space (denoted by $\mathbf{q}$),
this expression is different from the Eulerian space filtering result, Eq. (233).
In fact, it follows that $S_3 = 34/7$ even when filtering effects are taken into
account. The mere fact that one does not obtain the same result should not 
be surprising. In this latter case the filtering has been made at a given mass 
scale. The difference between the two calculations comes from the fact that 
the larger the mass of a region initially is, the smaller the volume it occupies 
will be. Filtering at a fixed Eulerian scale therefore mixes different initial mass 
scales. The asymmetry will then be less than one could have expected because, 
for a standard hierarchical spectrum, larger mass scales correspond to smaller 
fluctuations. 
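The arithmetic behind the Lagrangian-space value can be spelled out in one line. This is a sketch assuming that the second-order Jacobian kernel carries the Einstein-de Sitter factor $2/7$ and that the top-hat commutation property of Eq. (236) applies; $\sigma_R^2$ denotes the linear variance filtered at the Lagrangian scale $R$.

$$\langle\delta_R^3\rangle = 3\langle(\delta_R^{(1)})^2\,\delta_R^{(2)}\rangle = 6\,\sigma_R^4\left[1 - \frac{2}{7}\cdot\frac{2}{3}\right] = 6\,\sigma_R^4\cdot\frac{17}{21} = \frac{34}{7}\,\sigma_R^4 .$$

The first term in the bracket comes from the product $W(k_1R)\,W(k_2R)$ and the second from the angular average of the filtered Jacobian term, so no derivative of $\sigma^2(R)$ appears, in contrast with Eq. (238).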


5.1.4 Dependence of the Skewness on Cosmological Parameters

As the skewness is induced by gravitational dynamics, it is important to know
how much it can depend on cosmological parameters. In general the parameter
$S_3$ depends on the growth rate of the second-order PT solution, see Sect. 2.4.3,
through

$$S_3 = 3\nu_2 + \frac{d\log\sigma^2(R)}{d\log R}. \tag{241}$$

Explicit calculations [91] have shown that $\nu_2$ can be well approximated by

$$\nu_2 \approx \frac{4}{3} + \frac{2}{7}\,\Omega_m^{-2/63}, \tag{242}$$

obtained by expansion about $\Omega_m = 1$ for $\Omega_\Lambda = 0$.$^{32}$ We then have the following
result,

$$S_3 = \frac{34}{7} + \frac{6}{7}\left(\Omega_m^{-0.03} - 1\right) - (n+3). \tag{243}$$

A similar result follows when $\Omega_\Lambda \neq 0$, see [46,313] and also [223]. In practice,
for current applications to data, such a small dependence on cosmological
parameters can simply be ignored, as illustrated in Fig. 21. This turns out to
be true even when cosmologies with non-standard vacuum equation of state
are considered (e.g. quintessence models) [366,259,34].


5.1.5 The Skewness of the Local Velocity Divergence 

The skewness of the velocity divergence can obviously be calculated in a similar
fashion. However, because of the overall $f(\Omega_m,\Omega_\Lambda)$ factor for the linear growth
of velocities, it is natural to expect that the velocity divergence skewness
parameter, $T_3$, has a significant $\Omega_m$ dependence [50]. In general,

$$T_3 \equiv \frac{\langle\theta^3\rangle}{\langle\theta^2\rangle^2} = -\frac{1}{f(\Omega_m,\Omega_\Lambda)}\left[3\mu_2 + \frac{d\log\sigma^2(R)}{d\log R}\right]. \tag{244}$$

Taking into account the specific time dependence of $\mu_2$ we get,

$$T_3 = -\frac{1}{f(\Omega_m,\Omega_\Lambda)}\left[2 + \frac{12}{7}\,\Omega_m^{-1/21} + \frac{d\log\sigma^2(R)}{d\log R}\right], \tag{245}$$

which within a very good accuracy implies that $T_3 \approx -[26/7 - (n+3)]/\Omega_m^{0.6}$ for
a power-law spectrum. This makes the dimensionless quantity $T_3$ a very good
candidate for a determination of $\Omega_m$ independent of galaxy biasing. Attempts
to carry out such measurements, however, faced very large systematics in the
data [50]. So far no reliable constraints have been drawn from this technique.
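The contrast between the weak $\Omega_m$ dependence of $S_3$ and the strong one of $T_3$ is easy to quantify. The sketch below encodes the power-law, top-hat expressions quoted in the text, namely Eq. (243) for $S_3$ and the approximation $T_3 \approx -[26/7 - (n+3)]/\Omega_m^{0.6}$. The function names are ours and the formulas are assumed to hold for $\Omega_\Lambda = 0$.

```python
def s3_tophat(n, omega_m=1.0):
    """Tree-level density skewness for a power-law spectrum, Eq. (243)."""
    return 34.0 / 7.0 + (6.0 / 7.0) * (omega_m ** -0.03 - 1.0) - (n + 3.0)


def t3_tophat(n, omega_m=1.0):
    """Approximate velocity-divergence skewness quoted after Eq. (245)."""
    return -(26.0 / 7.0 - (n + 3.0)) / omega_m ** 0.6


def relative_change(func, n, omega_m):
    """Fractional change of func when going from Omega_m = 1 to omega_m."""
    reference = func(n, 1.0)
    return (func(n, omega_m) - reference) / reference
```

For instance, with $n = -1.5$ and $\Omega_m = 0.3$, `relative_change(s3_tophat, -1.5, 0.3)` is at the percent level whereas `relative_change(t3_tophat, -1.5, 0.3)` is of order unity, which is the reason $T_3$ was considered as a probe of $\Omega_m$.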


5.2 The Fourth-Order Density Cumulant: Kurtosis 

The previous results can be applied to any low-order cumulants of the cosmic
field. Fry [232] computed the fourth cumulant of the cosmic density field, but
without taking into account the filtering effects. These were included later for
top-hat [46] and Gaussian filters [407].

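Formally the fourth-order cumulant of the local density is given by,

$$\langle\delta^4\rangle_c \equiv \langle\delta^4\rangle - 3\langle\delta^2\rangle^2 \tag{246}$$

$$= 6\,\langle(\delta^{(1)})^2(\delta^{(2)})^2\rangle_c + 4\,\langle(\delta^{(1)})^3\,\delta^{(3)}\rangle_c .$$

$^{32}$ But it is valid for all values of $\Omega_m$ of cosmological interest.
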
In these equations it is essential to take the connected part only. There are
terms that involve loop corrections to the variance that are of the same order
in $\sigma$ but they naturally cancel when the non-connected part of the fourth
moment is subtracted out. The consequence is that,

$$\langle\delta^4\rangle_c \sim \langle\delta^2\rangle^3, \tag{247}$$

and one can define the kurtosis parameter $S_4$,

$$S_4 \equiv \langle\delta^4\rangle_c/\langle\delta^2\rangle^3. \tag{248}$$


This equation allows one to compute the leading part of $S_4$ in the weakly
non-linear regime. In general $S_4$ can be expressed in terms of the functions $D_1$,
$\nu_2$ and $\nu_3$. This can be obtained by successive applications of the geometrical
properties of the top-hat window function (see [46] and Appendix C for details).
We have,

$$S_4 = 4\nu_3 + 12\nu_2^2 + (14\nu_2 - 2)\,\frac{d\log[\sigma^2(R_0)]}{d\log R_0} + \frac{7}{3}\left(\frac{d\log[\sigma^2(R_0)]}{d\log R_0}\right)^2 + \frac{2}{3}\,\frac{d^2\log[\sigma^2(R_0)]}{d\log^2 R_0}. \tag{249}$$

For a power law spectrum of index $n$ this leads to

$$S_4 = \frac{60712}{1323} - \frac{62}{3}\,(n+3) + \frac{7}{3}\,(n+3)^2. \tag{250}$$

This result is exact for an Einstein-de Sitter universe. It is extremely accurate, 
within a few per cent for all models of cosmological interest. Similar results 
can be obtained for the velocity divergence. 
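The step from Eq. (249) to Eq. (250) can be verified with exact rational arithmetic. The sketch below assumes the Einstein-de Sitter vertices $\nu_2 = 34/21$ and $\nu_3 = 682/189$ and writes the logarithmic derivatives of $\sigma^2$ as `gamma1` and `gamma2`; these names, like the function names, are ours.

```python
from fractions import Fraction

NU2 = Fraction(34, 21)    # Einstein-de Sitter second-order vertex
NU3 = Fraction(682, 189)  # Einstein-de Sitter third-order vertex


def s4_tophat(gamma1, gamma2=0, nu2=NU2, nu3=NU3):
    """Eq. (249), with gamma1 = dlog(sigma^2)/dlog(R) and gamma2 its derivative."""
    return (4 * nu3 + 12 * nu2 ** 2 + (14 * nu2 - 2) * gamma1
            + Fraction(7, 3) * gamma1 ** 2 + Fraction(2, 3) * gamma2)


def s4_power_law(n):
    """Eq. (250): for P(k) ~ k**n, gamma1 = -(n + 3) and gamma2 = 0."""
    return s4_tophat(-(n + 3))
```

With integer or `Fraction` inputs the result stays exact, so the coefficients $60712/1323$, $62/3$ and $7/3$ of Eq. (250) can be read off directly.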


5.3 Results for Gaussian Smoothing Filters 

So far we have been giving results for a top-hat filter only. The reason is that
they can be given in a closed form for any shape of the power spectrum. Another
quite natural filter to choose is the Gaussian filter. In this case however
there are no simple closed forms that are valid for any power spectrum shape.
Results are known for power-law spectra only [355,436,407].

The principle of the calculation in this case is to decompose the angular part
that enters in the window function as a sum of Legendre functions,

$$e^{-pq\cos\varphi} = \sum_{m=0}^{\infty} (-1)^m\,(2m+1)\,\sqrt{\frac{\pi}{2pq}}\; I_{m+1/2}(pq)\; P_m(\cos\varphi), \tag{251}$$

where $I_{m+1/2}(pq)$ are Bessel functions. The integration over $\varphi$ is made simple
by the orthogonality relation between the Legendre polynomials. Finally each
term appearing in the decomposition of the Bessel function,

$$I_\nu(z) = \sum_{m=0}^{\infty} \frac{1}{m!\;\Gamma(\nu+m+1)}\left(\frac{z}{2}\right)^{\nu+2m}, \tag{252}$$

can be integrated out for power-law spectra since,

$$\int_0^\infty q^{\alpha}\, e^{-q^2}\, dq = \frac{1}{2}\,\Gamma\!\left(\frac{\alpha+1}{2}\right), \tag{253}$$

which after resummation leads to hypergeometric functions of the kind ${}_2F_1$.
Eventually the result for $S_3$ is

$$S_3 = 3\;{}_2F_1\!\left(\frac{n+3}{2},\frac{n+3}{2},\frac{3}{2},\frac{1}{4}\right) - \left(n+\frac{8}{7}\right){}_2F_1\!\left(\frac{n+3}{2},\frac{n+3}{2},\frac{5}{2},\frac{1}{4}\right), \tag{254}$$

and similarly the velocity skewness is

$$T_3 = -3\;{}_2F_1\!\left(\frac{n+3}{2},\frac{n+3}{2},\frac{3}{2},\frac{1}{4}\right) + \left(n+\frac{16}{7}\right){}_2F_1\!\left(\frac{n+3}{2},\frac{n+3}{2},\frac{5}{2},\frac{1}{4}\right). \tag{255}$$


This result is exact for an Einstein-de Sitter Universe but obviously, as for the
top-hat filter, $S_3$ is expected to depend only weakly on cosmological parameters
and the dominant dependence of $T_3$ is that proportional to $1/f(\Omega_m)$. The
result for $S_3$ is shown as a dashed line in Fig. 26.
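Since the hypergeometric functions are evaluated at argument $1/4$, their defining power series converges quickly and the Gaussian-filter skewness can be computed without any special-function library. The sketch below is a plain series evaluation; the function names are ours, and the formulas are the Einstein-de Sitter, power-law expressions of Eqs. (254) and (255).

```python
import math


def hyp2f1(a, b, c, z, terms=200):
    """Gauss hypergeometric series, adequate for |z| well below 1."""
    total = 1.0
    term = 1.0
    for m in range(terms):
        term *= (a + m) * (b + m) / ((c + m) * (m + 1.0)) * z
        total += term
    return total


def s3_gaussian(n):
    """Density skewness for Gaussian smoothing, Eq. (254)."""
    h = (n + 3.0) / 2.0
    return (3.0 * hyp2f1(h, h, 1.5, 0.25)
            - (n + 8.0 / 7.0) * hyp2f1(h, h, 2.5, 0.25))


def t3_gaussian(n):
    """Velocity-divergence skewness for Gaussian smoothing, Eq. (255)."""
    h = (n + 3.0) / 2.0
    return (-3.0 * hyp2f1(h, h, 1.5, 0.25)
            + (n + 16.0 / 7.0) * hyp2f1(h, h, 2.5, 0.25))
```

At $n = -3$ both hypergeometric functions equal unity and the unsmoothed values $34/7$ and $-26/7$ are recovered, which provides a simple check of the two expressions.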

The kurtosis cannot be calculated in closed form even for power-law spectra
(although a semi-analytic formula can be given [407]). However there exists a
simple prescription that allows one to get an approximate expression for the
kurtosis. It consists in using the formal expression of the kurtosis obtained for
a top-hat filter but calculated for $n = n_{\rm eff}$ such that it gives the correct value
for the skewness. Such a prescription has been found to give accurate results,
about 1% accuracy for $n = -1$ [407].


5.4 The Density Cumulants Hierarchy

In general the nonlinear couplings are going to induce non-zero cumulants at
any order. We can define [270]

$$S_p \equiv \langle\delta^p\rangle_c/\langle\delta^2\rangle^{p-1}, \tag{256}$$

that generalizes the $S_3$ and $S_4$ parameters considered in the previous section.
All these quantities are finite (and non-zero) at large scales for Gaussian initial
conditions and can in principle be computed from PT expansions. However,
the direct calculation of $S_p$ becomes extremely difficult with increasing order
$p$ due to the complexity of the kernels $F_p$ and $G_p$. Fortunately, it turns out to
be possible to take great advantage of the close relationship between the $S_p$
parameters and the vertices $\nu_p$ describing the spherical collapse dynamics, as
described in Sect. 2.4.2, to compute the $S_p$ parameters for any $p$.

Fig. 22. Diagrammatic representation of $\delta^{(p)}$. Each line stands for a factor $\delta(\mathbf{k})$.

In the derivation presented here we adopt a pedestrian approach for building,
step by step, the functional shape of the cumulant generating function.
A more direct approach has recently been developed in [660,661] in which the
generating function of the cumulant is obtained directly, via a saddle-point
approximation in the computation of the cumulant generating function which
corresponds to its tree-order calculation. This approach avoids technical difficulties
encountered in the computation of the Lagrangian space filtering properties
and in the Lagrangian-Eulerian mapping and is certainly an interesting
complementary view to what we present here.

5.4.1 The Unsmoothed Density Cumulant Generating Function

The computation of $S_p$ coefficients is based on the property that each of them
can be decomposed into a sum of products of “vertices”, at least when filtering
effects are not taken into account. As seen before, $S_4 = 12\nu_2^2 + 4\nu_3$. This
property extends to all orders, so that the $S_p$ parameters can be expressed as
functions of $\nu_q$'s only ($q = 2,\ldots,p-1$). Note that the vertices $\nu_p$ defined in
Eq. (48) as angular averages of PT kernels correspond to

$$\nu_p \equiv \langle\delta^{(p)}\,[\delta^{(1)}]^p\rangle_c/\langle[\delta^{(1)}]^2\rangle^p. \tag{257}$$

This decomposition of $S_p$ into a sum of products of vertices can be observed
easily in a graphical representation. Indeed

$$\langle\delta^p\rangle_c = \sum_{q_i}\langle\delta^{(q_1)}\cdots\delta^{(q_p)}\rangle_c, \tag{258}$$

where each $\delta$ has been expanded in PT. Each $\delta^{(q)}$ contains a product of $q$
random Gaussian variables $\delta(\mathbf{k})$. Each of these points can be represented by
one dot, so that when the ensemble average is computed, because of the Wick
theorem, dots are connected pairwise. The $\delta^{(q)}$ therefore can be represented
as in Fig. 22 with $q$ outgoing lines.

Diagrams that contribute to the leading order of $S_p$ are those which contain
enough dots so that a connected diagram that minimizes the number of links
can be built. The number of links for connecting $p$ points is $p-1$, we should
then have $\sum_i q_i = 2(p-1)$ so that

$$S_p = \sum_{{\rm graphs},\;\sum_i q_i = 2(p-1)} \langle\delta^{(q_1)}\cdots\delta^{(q_p)}\rangle_c \Big/ \langle[\delta^{(1)}]^2\rangle^{p-1}. \tag{259}$$

An example of such a graph for $S_5$ is shown in Fig. 24.

Fig. 23. Computation of the simplest graphs. Each line represents a factor $\sigma^2$. Vertices
are obtained from the angular average of the wave vectors leaving $\nu_p$.

Fig. 24. A graph contributing to $S_5$.

It is worth noting that all these diagrams are trees, so that the integration
over the wave vectors can be made step by step$^{33}$. Then the value of each
diagram is obtained by assigning each line to the value of $\sigma^2$ and each vertex
to $\nu_p$ depending on the number $p$ of lines it is connected to, see e.g. Fig. 23.
This order by order decomposition can actually be replaced by a functional
relation at the level of the generating functions. If we define the generating
function of $S_p$ as

$$\varphi(y) = \sum_{p=1}^{\infty} -S_p\,\frac{(-y)^p}{p!} \qquad (S_1 = S_2 \equiv 1), \tag{260}$$

and the vertex generating function as

$$\mathcal{G}_\delta(\tau) = \sum_{p=1}^{\infty} \nu_p\,\frac{(-\tau)^p}{p!}, \tag{261}$$

it is possible to show that $\varphi$ and $\mathcal{G}_\delta$ are related to each other through the
system of equations

$$\varphi(y) = y\,\mathcal{G}_\delta[\tau(y)] + \frac{1}{2}\,\tau^2(y), \tag{262}$$

$$\tau(y) = -y\,\mathcal{G}_\delta'[\tau(y)]. \tag{263}$$

$^{33}$ This is possible however only when smoothing effects are neglected.

Fig. 25. Graphical representation of Eq. (263), $\tau$ is the generating function of graphs
with one external line.

The demonstration of these equations is not straightforward and is given in
Appendix B. To get some insight about these two equations, one can note that
$\tau$ is the conjugate variable to the one-line vertex (that is $\nu_1$, set to unity at
the end of the calculation). As such, it corresponds to the generating function
of all graphs with one external line. It is then the solution of an implicit equation,
illustrated in Fig. 25, which corresponds to Eq. (263). Naturally, it involves
the vertex generating function. It is to be noted however that in this perspective
the equations (262,263) and the parameter $y$ have no intrinsic physical
interpretation. It has been pointed out recently in [660,661] that this system
can actually be obtained directly from a saddle-point approximation in the
computation of the local density contrast PDF. It gives insights into the physical
meaning of the solutions of Eq. (263). We will come back to this point in
Sect. 5.8.

Recall that vertices describe the spherical collapse dynamics (see Sect. 2.4.2),
thus $\mathcal{G}_\delta(\tau)$ corresponds to the density contrast of collapsing structures with
spherical symmetry when $(-\tau)$ is its linear density contrast. The first few
values of $\nu_p$ can then be easily computed,


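$$\nu_2 = \frac{34}{21}, \qquad \nu_3 = \frac{682}{189}, \qquad \nu_4 = \frac{446440}{43659}, \tag{264}$$

which implies,

$$S_3 = 3\nu_2 = \frac{34}{7}; \tag{265}$$

$$S_4 = 4\nu_3 + 12\nu_2^2 = \frac{60712}{1323} \approx 45.89; \tag{266}$$

$$S_5 = 5\nu_4 + 60\nu_3\nu_2 + 60\nu_2^3 = \frac{200575880}{305613} \approx 656.3; \tag{267}$$

$$S_6 = 6\nu_5 + 120\nu_4\nu_2 + 90\nu_3^2 + 720\nu_3\nu_2^2 + 360\nu_2^4 \approx 12{,}700. \tag{268}$$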

At this stage however, the effects of filtering have not been taken into account. 
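The relation between the vertex generating function and the cumulant generating function can be exercised order by order with truncated power series. The sketch below solves Eq. (263) by fixed-point iteration in powers of $y$, inserts the result in Eq. (262), and reads off $S_p$ for $p \geq 2$ from the coefficient of $y^p$, using $S_p = (-1)^{p+1}\,p!\,\varphi_p$ as implied by Eq. (260). All names are ours, and the truncation order `ORDER` is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial

ORDER = 5  # highest power of y kept in every truncated series


def mul(a, b):
    """Product of two power series in y, truncated at y**ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out


def series_of(tau, coeffs):
    """Evaluate sum_k coeffs[k] * tau**k as a truncated series in y."""
    out = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # tau**0
    for c in coeffs:
        out = [o + c * p for o, p in zip(out, power)]
        power = mul(power, tau)
    return out


def cumulant_ratios(nu):
    """Return {p: S_p} for 2 <= p <= ORDER from nu = {1: 1, 2: nu_2, ...}."""
    pmax = max(nu)
    # Taylor coefficients of G(tau) and of its derivative, from Eq. (261).
    g = [Fraction(0)] + [Fraction((-1) ** p) * nu[p] / factorial(p)
                         for p in range(1, pmax + 1)]
    g_prime = [k * g[k] for k in range(1, pmax + 1)]
    y = [Fraction(0), Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    tau = [Fraction(0)] * (ORDER + 1)
    for _ in range(ORDER + 1):  # each pass of Eq. (263) fixes one more order
        tau = [-c for c in mul(y, series_of(tau, g_prime))]
    half_tau_sq = [c / 2 for c in mul(tau, tau)]
    phi = [a + b for a, b in zip(mul(y, series_of(tau, g)), half_tau_sq)]
    return {p: (-1) ** (p + 1) * factorial(p) * phi[p]
            for p in range(2, ORDER + 1)}
```

Feeding in the vertices of Eq. (264) reproduces the exact fractions quoted for $S_3$, $S_4$ and $S_5$; obtaining $S_6$ would additionally require $\nu_5$ and `ORDER = 6`.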

5.4.2 Geometrical Properties of Smoothing in Lagrangian Space

As the examination of the particular case of $S_3$ has shown, the smoothing
effects for a top-hat filter are entirely due to the mapping between Lagrangian
space and Eulerian space. This can be generalized to any order [44].

The Lagrangian space dynamics is jointly described by the displacement field
(that plays a role similar to the velocity field) and the Jacobian, whose inverse
gives the density. The latter can be expanded with respect to the initial density
contrast,

$$J(\mathbf{q}) = 1 + J^{(1)}(\mathbf{q}) + J^{(2)}(\mathbf{q}) + \ldots \tag{269}$$

At a given order we will have$^{34}$,

$$J^{(p)}(\mathbf{q}) = a^p\int d^3k_1 \ldots d^3k_p\; \delta(\mathbf{k}_1)\cdots\delta(\mathbf{k}_p)\; J_p(\mathbf{k}_1,\ldots,\mathbf{k}_p)\, \exp[i\mathbf{q}\cdot(\mathbf{k}_1+\ldots+\mathbf{k}_p)]. \tag{270}$$

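The Jacobian is actually given by the determinant of the deformation tensor,
obtained from the first derivative of the displacement field, $\Psi$, see Eq. (91).
The precise relation is

$$J(\mathbf{q}) = \left|\frac{\partial\mathbf{x}}{\partial\mathbf{q}}\right| = 1 + \nabla_q\cdot\Psi + \frac{1}{2}\left[(\nabla_q\cdot\Psi)^2 - \sum_{ij}\Psi_{i,j}\Psi_{j,i}\right] + \frac{1}{6}\left[(\nabla_q\cdot\Psi)^3 - 3\,\nabla_q\cdot\Psi\sum_{ij}\Psi_{i,j}\Psi_{j,i} + 2\sum_{ijk}\Psi_{i,j}\Psi_{j,k}\Psi_{k,i}\right]. \tag{271}$$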
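Equation (271) is the expansion of a $3\times 3$ determinant in terms of traces of powers of the deformation tensor, and it terminates exactly at third order. The sketch below checks this with exact rational arithmetic; the function names are ours, and `psi[i][j]` stands for the derivative $\Psi_{i,j}$ evaluated at one point.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def jacobian_determinant(psi):
    """J = det(delta_ij + Psi_{i,j}), the definition used in Eq. (271)."""
    return det3([[psi[i][j] + (1 if i == j else 0) for j in range(3)]
                 for i in range(3)])


def jacobian_expansion(psi):
    """Right-hand side of Eq. (271), written with traces of psi, psi^2, psi^3."""
    idx = range(3)
    tr1 = sum(psi[i][i] for i in idx)
    tr2 = sum(psi[i][j] * psi[j][i] for i in idx for j in idx)
    tr3 = sum(psi[i][j] * psi[j][k] * psi[k][i]
              for i in idx for j in idx for k in idx)
    return (1 + tr1 + (tr1 ** 2 - tr2) / 2
            + (tr1 ** 3 - 3 * tr1 * tr2 + 2 * tr3) / 6)
```

The two functions agree for any matrix, symmetric or not, which is why the second- and third-order brackets in Eq. (271) generate the kernels $\beta$ and $\eta$ without further approximation.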


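The equations of motion are closed by the Euler equation, Eq. (90). This
shows that the kernels of the Jacobian expansion are built recursively from
the function $\beta(\mathbf{k}_1, \mathbf{k}_2) = 1 - (\mathbf{k}_1\cdot\mathbf{k}_2)^2/(k_1 k_2)^2$ and

$$\eta(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = 1 - \left(\frac{\mathbf{k}_1\cdot\mathbf{k}_2}{k_1 k_2}\right)^2 - \left(\frac{\mathbf{k}_2\cdot\mathbf{k}_3}{k_2 k_3}\right)^2 - \left(\frac{\mathbf{k}_3\cdot\mathbf{k}_1}{k_3 k_1}\right)^2 + 2\,\frac{(\mathbf{k}_1\cdot\mathbf{k}_2)(\mathbf{k}_2\cdot\mathbf{k}_3)(\mathbf{k}_3\cdot\mathbf{k}_1)}{k_1^2 k_2^2 k_3^2}. \tag{272}$$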


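We have seen previously that a top-hat filter commutes with $\beta$. It can also be
shown that,

$$\int \frac{d\Omega_1}{4\pi}\,\frac{d\Omega_2}{4\pi}\,\frac{d\Omega_3}{4\pi}\; W(|\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3| R)\; \eta(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \frac{2}{9}\, W(k_1 R)\, W(k_2 R)\, W(k_3 R). \tag{273}$$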


Here again, an exact “commutation property” is observed. Successive
applications of these geometrical properties (see footnote 35) then imply that [45],

$$\bar{j}_p \equiv \overline{J_p(\mathbf{k}_1, \ldots, \mathbf{k}_p)\, W(|\mathbf{k}_1 + \cdots + \mathbf{k}_p| R)} \tag{274}$$

$$\phantom{\bar{j}_p} = \overline{J_p}(\mathbf{k}_1, \ldots, \mathbf{k}_p)\, W(k_1 R) \cdots W(k_p R), \tag{275}$$

where a bar denotes angular averaged quantities. This is a generalization of
the results obtained for the parameter $S_3$, which has been found to be insensitive
to filtering effects in Lagrangian space (for a top-hat filter only).

34 We assume $\Omega_m = 1$, but the calculations trivially extend to all cosmologies.

35 This demonstration is incomplete here because the displacement in Lagrangian
space is not in general potential (see [45] for a more complete demonstration).
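The commutation of a top-hat filter with $\beta$ can be checked numerically. The sketch below assumes the standard Fourier-space top-hat window $W(x) = 3(\sin x - x\cos x)/x^3$ and the second-order form of the property, $\int \frac{d\Omega_{12}}{4\pi}\, W(|\mathbf{k}_1+\mathbf{k}_2|R)\,\beta(\mathbf{k}_1,\mathbf{k}_2) = \frac{2}{3}\, W(k_1R)\, W(k_2R)$; the function names and the test wavenumbers are illustrative.

```python
import numpy as np

def top_hat(x):
    """Fourier-space top-hat window, W(x) = 3 (sin x - x cos x) / x^3."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-4
    xs = np.where(small, 1.0, x)   # placeholder to avoid 0/0; series used where small
    exact = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, exact)

def beta_average(k1R, k2R, n_nodes=64):
    """Average of W(|k1 + k2| R) * [1 - mu^2] over the angle between k1 and k2."""
    mu, w = np.polynomial.legendre.leggauss(n_nodes)   # Gauss-Legendre on [-1, 1]
    k12R = np.sqrt(k1R**2 + k2R**2 + 2.0 * k1R * k2R * mu)
    return 0.5 * np.sum(w * top_hat(k12R) * (1.0 - mu**2))

k1R, k2R = 1.3, 0.7   # arbitrary illustrative values of k R
lhs = beta_average(k1R, k2R)
rhs = (2.0 / 3.0) * top_hat(k1R) * top_hat(k2R)
print(lhs, rhs)
```

Only the angle between the two wavevectors matters for this average, so the solid-angle integral reduces to a one-dimensional quadrature in $\mu = \cos\theta_{12}$. Replacing `top_hat` by a Gaussian window breaks the equality, in line with the remark that the property holds for a top-hat filter only.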


5.4.3 Lagrangian to Eulerian Space Mapping: Smoothed Case

As for the skewness $S_3$, a mapping between Lagrangian and Eulerian space
should permit one to calculate the $S_p$’s at any order $p$.

The hierarchy in Eq. (275) gives implicitly the cumulant generating function of
the volume distribution function for a fixed mass scale. One can then make the
following remark: the probability that a mass $M$ occupies a volume larger than
$V$ is also the probability that a volume $V$ contains a mass lower than $M$. It
suffices for that to consider concentric spheres around a given point $\mathbf{x}_0$
(see footnote 36). It is therefore possible to relate the real space density PDF
to the Lagrangian space one. At this stage however we are only interested in the
leading order behavior of the cumulants. We can then notice that, in the small
variance limit, the one-point density PDF, formally given by Eq. (142), can be
calculated by the steepest descent method. The saddle point position is given by
the equation $d\varphi(y)/dy = \delta$, and in addition we have
$d\varphi(y)/dy = \mathcal{G}_\delta(\tau)$, where $\tau$ is given implicitly by
Eq. (263). The saddle-point position is therefore obtained by a simple change of
variable from the linear density $\tau$ to the nonlinear density contrast $\delta$.
It implies that the one-point PDF is roughly given by

$$p(\delta)\,d\delta \sim \exp\left(-\frac{\tau^2}{2\sigma^2}\right) d\delta \tag{276}$$

with a weakly $\delta$-dependent prefactor. It is important to note that the leading
order cumulants of this PDF do not depend on these prefactors. They are
entirely encoded in the $\tau$-$\delta$ relation.

As suggested in the previous paragraph, if we now identify $P_E(\delta > \delta_0)$ and
$P_L(\delta < \delta_0)$ (one being computed at a fixed real space radius, the other at a
fixed mass scale) we obtain a consistency relation

$$\frac{\tau_E^2}{2\sigma^2(R)} = \frac{\tau_L^2}{2\sigma^2\left[(1+\delta)^{1/3} R\right]}, \tag{277}$$

so that the two have the same leading-order cumulants. Here and in the following
we use indices L or E for variables that live respectively in Lagrangian
space or Eulerian space. More precisely we denote by $\varphi^L$ the cumulant
generating function in Lagrangian space and $\mathcal{G}_\delta^L$ the corresponding
vertex generating function. In Eulerian space we use the E superscript (see
footnote 37). In the previous equation, the density contrast is a parameter given
a priori. The variables $\tau_E$ and $\tau_L$ depend formally on $\delta$ through the
saddle-point equations,

$$\delta = \mathcal{G}_\delta^L(\tau_L) = \mathcal{G}_\delta^E(\tau_E), \tag{278}$$

and in Lagrangian space $\sigma$ is taken at the mass scale corresponding to the
density contrast $\delta$ ($\sigma$ is computed a priori in Eulerian space).

36 This statement is however rigorous for centered probabilities only.

Fig. 26. The predicted $S_p$ parameters for power law spectra as functions of the
spectral index. The results are shown for top-hat filter except for the dashed line
which corresponds to the skewness for a Gaussian filter.

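From these equations we can eliminate $\tau_L$ to get an implicit equation between
$\mathcal{G}_\delta^E$ and $\tau_E$,

$$\mathcal{G}_\delta^E(\tau_E) = \mathcal{G}_\delta^L\left(\tau_E\, \frac{\sigma\left([1 + \mathcal{G}_\delta^E(\tau_E)]^{1/3} R\right)}{\sigma(R)}\right), \tag{279}$$

where $\mathcal{G}_\delta^L(\tau_L)$ is known and is obtained from spherical collapse
dynamics. The cumulant generating function, $\varphi^E(y)$, is then built from
$\mathcal{G}_\delta^E(\tau_E)$ the same way as $\varphi^L(y)$ was from
$\mathcal{G}_\delta^L(\tau_L)$ [Eqs. (262) and (263)].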

Expanding this function around $y = 0$ leads to explicit expressions for the first
few values of $S_p$. They can be written as functions of successive logarithmic
derivatives of the variance,

$$\gamma_p \equiv \frac{d^p \log \sigma^2(R)}{d \log^p R}, \tag{280}$$

and read

$$S_3 = \frac{34}{7} + \gamma_1, \tag{281}$$

$$S_4 = \frac{60712}{1323} + \frac{62\,\gamma_1}{3} + \frac{7\,\gamma_1^2}{3} + \frac{2\,\gamma_2}{3}, \tag{282}$$

$$S_5 = \frac{200575880}{305613} + \frac{1847200\,\gamma_1}{3969} + \frac{6940\,\gamma_1^2}{63} + \frac{235\,\gamma_1^3}{27} + \frac{1490\,\gamma_2}{63} + \frac{50\,\gamma_1\gamma_2}{9} + \frac{10\,\gamma_3}{27}, \tag{283}$$

$$S_6 = \frac{351903409720}{27810783} + \frac{3769596070\,\gamma_1}{305613} + \frac{17907475\,\gamma_1^2}{3969} + \frac{138730\,\gamma_1^3}{189} + \frac{1210\,\gamma_1^4}{27} + \frac{3078965\,\gamma_2}{3969} + \frac{23680\,\gamma_1\gamma_2}{63} + \frac{410\,\gamma_1^2\gamma_2}{9} + \frac{35\,\gamma_2^2}{9} + \frac{3790\,\gamma_3}{189} + \frac{130\,\gamma_1\gamma_3}{27} + \frac{5\,\gamma_4}{27}. \tag{284}$$

37 It is always possible to assume that there exists a function $\mathcal{G}_\delta^E$
associated to $\varphi^E$, even if there is no associated diagrammatic representation,
assuming the same formal functional relation between them.

Fig. 27. The $S_p$ parameters for $3 \le p \le 7$. Comparisons between theoretical
predictions and results from numerical simulations (from [28]) ($\sigma_8$ is the linear
variance in a sphere of radius $8\,h^{-1}$ Mpc).


For a power-law spectrum, these coefficients depend only on the spectral index $n$,
through $\gamma_1 = -(n+3)$ and $\gamma_i = 0$ for $i \ge 2$. They are plotted as functions
of $n$ in Fig. 26. They all appear to be decreasing functions of $n$. The above
predictions were compared against numerical experiments, as illustrated in
Fig. 27 for CDM. The agreement between theory and measurements is close
to perfect as long as the variance is below unity. It is quite remarkable to see
that the validity domain of PT results does not deteriorate significantly when
the cumulant order increases.

Table 6
Tree-level and one-loop corrections predicted by various non-linear approximations.

Moment expansions                         $s_{2,4}$   $S_{3,0}$   $S_{3,2}$   $S_{4,0}$   $S_{4,2}$
FFA, Unsmoothed                           0.43        3           1           16          15.0
LPA, Unsmoothed                           0.72        3.40        2.12        21.22       37.12
ZA, Unsmoothed                            1.27        4           4.69        30.22       98.51
Exact PT, Unsmoothed                      1.82        4.86        9.80        45.89       -
Exact PT, Top-Hat Smoothing, n = -2       0.88        3.86        3.18        27.56       -
Exact PT, Gaussian Smoothing, n = -2      0.88        4.02        3.83        30.4        -
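For power-law spectra the tree-level expressions in Eqs. (281) and (282) reduce to polynomials in $\gamma_1 = -(n+3)$. A small sketch evaluating them for the spectral indices used in the tables of this section (the function names are illustrative):

```python
from fractions import Fraction as F

def S3(g1):
    """Tree-level skewness for top-hat smoothing, Eq. (281)."""
    return F(34, 7) + g1

def S4(g1, g2=0):
    """Tree-level kurtosis for top-hat smoothing, Eq. (282)."""
    return F(60712, 1323) + F(62, 3) * g1 + F(7, 3) * g1**2 + F(2, 3) * g2

for n in (-3, -2, -1, 0):
    g1 = -(n + 3)   # power-law spectrum: gamma_1 = -(n + 3), higher gamma_i vanish
    print(n, round(float(S3(g1)), 2), round(float(S4(g1)), 2))
```

The printed values can be compared with the $S_{3,0}$ and $S_{4,0}$ rows of Table 7, where $n = -3$ corresponds to the unsmoothed case.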

5.5 One-Loop Corrections to One-Point Moments 

We now consider results that include the dependence of the $S_p$ parameters on the
variance. Due to the complexity of these calculations, only a few exact results
are known, but there are useful approximate results from the spherical collapse
model.

5.5.1 Exact Results 

To get loop corrections for the one-point density moments, it is necessary to
expand both the second moment and the higher-order moments with respect
to the linear variance $\sigma_L^2$,

$$\sigma^2 = \sigma_L^2 + \sum_{n=3}^{\infty} s_{2,n}\, \sigma_L^n, \tag{285}$$

and

$$S_p(\sigma_L) = S_{p,0} + \sum_{n=1}^{\infty} S^L_{p,n}\, \sigma_L^n. \tag{286}$$

Note that for Gaussian initial conditions, the contributions with $n$ odd vanish.
The $S_p$ parameters can also be expanded with respect to the non-linear
variance,

$$S_p(\sigma) = S_{p,0} + \sum_{n=1}^{\infty} S_{p,n}\, \sigma^n, \tag{287}$$

and it is easy to see that $S_{p,2} = S^L_{p,2}$, $S_{p,4} = S^L_{p,4} - S^L_{p,2}\, s_{2,4}$, etc.,
for Gaussian initial conditions. Table 6 shows the results of one-loop corrections in
various approximations to the dynamics described in Sect. 2.8 (frozen flow
approximation, FFA; linear potential approximation, LPA; and ZA), and exact
PT [557]. These results, however, ignore the effects of smoothing which,
as is known from tree-level results, are significant.

Table 7
Values for the higher-order perturbative contributions in the SC model for the
unsmoothed ($n = -3$) and smoothed ($n = -2, -1, 0$) density fields, for a top-hat
filter and a power-law power spectrum. When known, exact one-loop results are
quoted in brackets. More details can be found in [222].

SC               Unsmoothed        Smoothed
                 n = -3            n = -2           n = -1           n = 0
$s_{2,4}$        1.44 [1.82]       0.61 [0.88]      0.40 [∞]         0.79 [∞]
$s_{2,6}$        3.21              0.34             0.05             0.68
$S_{3,0}$        4.86              3.86             2.86             1.86
$S^L_{3,2}$      10.08 [9.80]      3.21 [3.18]      0.59 [∞]         -0.02 [∞]
$S^L_{3,4}$      47.94             3.80             0.07             0.06
$S_{4,0}$        45.89             27.56            13.89            4.89
$S^L_{4,2}$      267.72            63.56            7.39             -0.16
$S^L_{4,4}$      2037.2            138.43           1.99             0.31
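The relation between the coefficients of the two expansions, Eqs. (286) and (287), can be made explicit in two steps for Gaussian initial conditions (odd orders vanish): invert Eq. (285) to the order needed and substitute into Eq. (286). The last line is only an arithmetic illustration using the unsmoothed SC entries of Table 7; the resulting number is not quoted in the source.

```latex
% Step 1: invert Eq. (285), \sigma^2 = \sigma_L^2 + s_{2,4}\sigma_L^4 + \dots
\sigma_L^2 = \sigma^2 - s_{2,4}\,\sigma^4 + O(\sigma^6)

% Step 2: substitute into Eq. (286)
S_p = S_{p,0} + S^L_{p,2}\,\sigma_L^2 + S^L_{p,4}\,\sigma_L^4 + \dots
    = S_{p,0} + S^L_{p,2}\,\sigma^2
      + \left(S^L_{p,4} - S^L_{p,2}\,s_{2,4}\right)\sigma^4 + O(\sigma^6)

% Illustration (SC model, n = -3 column of Table 7):
S_{3,4} = 47.94 - 10.08 \times 1.44 \approx 33.42
```

Reading off the coefficients of $\sigma^2$ and $\sigma^4$ gives $S_{p,2} = S^L_{p,2}$ and $S_{p,4} = S^L_{p,4} - S^L_{p,2}\,s_{2,4}$.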

Taking into account smoothing effects in the exact PT framework has only
been done numerically for the case $n = -2$, where the one-loop bispectrum
yields a closed form [559]. The resulting one-loop coefficients are shown in
Table 6 as well, for top-hat and Gaussian smoothing. When $n \ge -1$, one-loop
corrections to $S_3$ diverge, as for the power spectrum and bispectrum.

5.5.2 The Spherical Collapse Model Approximation 

Given the complexity of loop calculations, approximate expressions have been
sought. The so-called spherical collapse (SC) model prescription [222] provides
an elegant way of getting approximate loop corrections for the local cumulants
(see footnote 38).

This model consists in assuming that shear contributions in the equations
of motion in Lagrangian space can be neglected, which implies that density
fluctuations grow locally according to spherical collapse dynamics. In this case,
the cumulants can be obtained by a simple nonlinear transformation of the
local Lagrangian density contrast $\delta$,

$$\delta = \left[1 + \mathcal{G}_\delta(-\delta_{\rm lin})\right] \left\langle \left[1 + \mathcal{G}_\delta(-\delta_{\rm lin})\right]^{-1} \right\rangle_L - 1, \tag{288}$$

expressed in terms of the linear density contrast $\delta_{\rm lin}$, assumed to obey
Gaussian statistics. Note that the ensemble average in Eq. (288) is computed in
Lagrangian space (see footnote 39). Given the fact that the usual ensemble average
in Eulerian space is related to the Lagrangian one through
$\langle X \rangle_L = \langle (1+\delta)\, X \rangle$, the normalization factor
$\langle [1 + \mathcal{G}_\delta(-\delta_{\rm lin})]^{-1} \rangle_L$ is required to obey
the constraint that $\langle (1+\delta)^{-1} \rangle_L = \langle 1 \rangle = 1$.

38 Another prescription, which turns out to be not as accurate, is given in [534].

For Gaussian initial conditions, the SC model reproduces the tree-level results.
Its interest comes from the fact that estimates of loop corrections can
be obtained by pursuing relatively simple calculations to the required order.
In addition, as we shall see in the next section, it allows a straightforward
extension to non-Gaussian initial conditions. The smoothing effects, as shown
from calculations exact up to tree level, introduce further complications but
can be taken into account by simply changing the vertex generating function
$\mathcal{G}_\delta$ in Eq. (288) to the one found in Eq. (279). Rigorously, this
equation is valid only at tree level: its extension to loop corrections in the SC
model can hardly be justified (see footnote 40), but turns out to be a good
approximation.
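The structure of the SC prescription, Eq. (288), can be illustrated numerically. The sketch below is not the calculation of [222]: as a simplifying assumption it truncates the vertex generating function at second order, $1 + \mathcal{G}_\delta(-\delta_{\rm lin}) \approx 1 + \delta_{\rm lin} + \nu_2\,\delta_{\rm lin}^2/2$ with $\nu_2 = 34/21$, and ignores smoothing. Lagrangian averages over the Gaussian $\delta_{\rm lin}$ are done with Gauss-Hermite quadrature, and Eulerian averages follow from $\langle X \rangle = \langle X/(1+\delta) \rangle_L$.

```python
import numpy as np

NU2 = 34.0 / 21.0   # second-order spherical-collapse vertex, Eq. (264)

def sc_moments(sigma_lin, n_nodes=80):
    """Eulerian mean, variance and third moment of delta in a truncated SC mapping."""
    # Nodes/weights for the weight function exp(-x^2 / 2); normalise to a unit Gaussian.
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / w.sum()
    d_lin = sigma_lin * x
    # Assumption: vertex generating function truncated at second order.
    rho_unnorm = 1.0 + d_lin + 0.5 * NU2 * d_lin**2
    norm = np.sum(w / rho_unnorm)     # Lagrangian average of 1 / (1 + G)
    rho = rho_unnorm * norm           # 1 + delta, as in Eq. (288)
    delta = rho - 1.0
    # Eulerian average: <X> = <X / (1 + delta)>_L
    mean = np.sum(w * delta / rho)
    var = np.sum(w * delta**2 / rho)
    third = np.sum(w * delta**3 / rho)
    return mean, var, third

mean, var, third = sc_moments(0.01)
S3 = third / var**2
print(mean, var, S3)
```

The normalization factor makes the Eulerian mean of $\delta$ vanish by construction, and in the small-variance limit the skewness parameter approaches the unsmoothed tree-level value $3\nu_2 = 34/7$, which is the sense in which the prescription reproduces tree-level results for Gaussian initial conditions. Loop corrections obtained from this truncated mapping would not match Table 7, since higher vertices are dropped.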

When comparisons are possible, the SC model is seen to provide predictions
that are in good agreement with exact PT results (see Table 7), in particular
for the $S_p$ parameters. For the variance (or cumulants), however, the SC
prescription does not work as well (see e.g. Fig. 28). The reason for this is the
tidal contributions, which are neglected in the SC approximation and lead
to the previrialization effects discussed for the exact PT case in Sect. 4.2.1.
Tidal effects tend to cancel for the $S_p$ because of the ratios of cumulants
involved. In the SC prescription no divergences are found for $n \ge -1$, thus the
interpretation of those remains unresolved.

When tested against numerical simulations, the SC model provides a good 
account of the departure from tree-level results as illustrated by Fig. 28 for 
CDM models (see also Fig. 37 in Sect. 5.13). 

5.6 Evolution from Non-Gaussian Initial Conditions 

We now discuss the effects of non-Gaussian initial conditions on the evolution
of smoothed moments of the density field. As pointed out in Sect. 4.4, this
is a complicated subject due to the infinite number of possible non-Gaussian
initial conditions. For this reason, there are few general results, and only some
particular models have been worked out in detail. Early work concentrated
on numerical simulation studies [464,684,139] of models with positive and
negative primordial skewness and comparison with observations. In addition,
a number of studies considered the evolution of higher-order moments from
non-Gaussian initial conditions given by cosmic strings [146,9] and texture
models [252] using numerical simulations. Recently, measurements of higher-order
moments in numerical simulations with $\chi^2_N$ initial conditions with $N$
degrees of freedom were given in [689].

39 Which means that all matter elements are equally weighted, instead of volume
elements.

40 In the SC model, the kernels in the Jacobian of the mapping from Lagrangian
to Eulerian space present no angular dependence, and this is actually incompatible
with the commutation property in Eq. (275).

Fig. 28. Non-linear evolution of the variance (left panels) and of the skewness
parameter $S_3$ (right panels) from 10 realizations of flat CDM $N$-body simulations.
Two models are considered, ΛCDM with $\Omega_m + \Omega_\Lambda = 1$ and $\Gamma = 0.2$,
and SCDM with $\Omega_m = 1$ and $\Gamma = 0.5$, where $\Gamma$ is the shape parameter of
the power-spectrum [201]. In the left panels, symbols show the ratio of the non-linear
to the linear variance as a function of smoothing radius. The value of $\Gamma$ is
indicated on the panels, while $\sigma_8$ stands for the linear variance in a sphere of
radius $8\,h^{-1}$ Mpc. The SC model predictions are shown as a short-dashed line
while one-loop PT predictions are shown as a solid line. The arrows indicate where
$\sigma_L = 0.5$. In the right panels, the output times correspond to $\sigma_8 = 0.5$
(top) and $\sigma_8 = 0.7$ (bottom). Squares and triangles correspond to measurements
in $\Gamma = 0.2$ and $\Gamma = 0.5$ simulations, respectively. Each case is compared to
the corresponding PT tree-level predictions (solid lines) and SC model (long-dashed).
From [222].

General properties of one-point moments evolved from non-Gaussian initial
conditions were considered using PT in [238,333,124,255,195]. To illustrate the
main ideas, let us write the PT expression for the first one-point moments:

$$\langle \delta^2 \rangle = \langle \delta_1^2 \rangle + [2 \langle \delta_1 \delta_2 \rangle] + \langle \delta_2^2 \rangle + 2 \langle \delta_1 \delta_3 \rangle + O(\sigma^5), \tag{289}$$

$$\langle \delta^3 \rangle = [\langle \delta_1^3 \rangle] + 3 \langle \delta_1^2 \delta_2 \rangle + [3 \langle \delta_2^2 \delta_1 \rangle + 3 \langle \delta_1^2 \delta_3 \rangle] + O(\sigma^6), \tag{290}$$

$$\langle \delta^4 \rangle = \langle \delta_1^4 \rangle + [4 \langle \delta_1^3 \delta_2 \rangle] + 6 \langle \delta_1^2 \delta_2^2 \rangle + 4 \langle \delta_1^3 \delta_3 \rangle + O(\sigma^7), \tag{291}$$

where we simply use the PT expansion $\delta = \delta_1 + \delta_2 + \ldots$ Square brackets
denote terms which scale as odd powers of $\delta_1$, and thus vanish for Gaussian initial
conditions. A first general remark one can make is that these additional terms
give to non-Gaussian initial conditions a different scaling than for the Gaussian
case [238,124]. In addition, the other terms in the skewness have contributions
from non-Gaussian initial conditions as well; this does not modify the scaling of
these terms but it can significantly change the amplitude. When dealing with
non-Gaussian initial conditions, the time dependence and scale dependence
must be considered separately. To illustrate this, consider the evolution of the
$S_p$ parameters as a function of smoothing scale $R$ and redshift $z$, assuming for
simplicity $\Omega_m = 1$ [so that the growth factor is $a(z) = (1+z)^{-1}$]; at the
largest scales, where linear PT applies, we have

$$S_p(R, z) \sim (1+z)^{p-2}\, S_p^I(R). \tag{292}$$

For dimensional scaling models, where the initial conditions satisfy
$S_p^I(R) \sim [\sigma_I(R)]^{2-p}$, this implies $S_p(R, z) \sim [\sigma_I(R, z)]^{2-p}$; that is,
the $S_p$ parameters scale as inverse powers of the variance at all times. Note,
however, that Eq. (292) is more general: it implies that, irrespective of scaling
considerations, in non-Gaussian models the $S_p$ parameters should be an increasing
function of redshift; this can be used to constrain primordial non-Gaussianity from
observations (see footnote 41). However, we caution that, as mentioned in Sect. 4.4,
all these arguments are valid if the non-Gaussian fluctuations were generated at early
times, and their sources are not active during structure formation.
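The redshift scaling in Eq. (292) follows from counting powers of the linear growth factor. As a short sketch, assuming $\Omega_m = 1$, the usual definition $S_p \equiv \langle\delta^p\rangle_c / \sigma^{2(p-1)}$, and initial ("I") quantities linearly extrapolated to $z = 0$ (the normalization is a convention chosen here for illustration):

```latex
% Linear growth: \delta(z) = (1+z)^{-1}\,\delta_I
\langle \delta^p \rangle_c(R, z) = (1+z)^{-p}\, \langle \delta^p \rangle_c^{I}(R),
\qquad
\sigma^2(R, z) = (1+z)^{-2}\, \sigma_I^2(R)

% Ratio defining S_p
S_p(R, z) = \frac{\langle \delta^p \rangle_c(R, z)}{\sigma^{2(p-1)}(R, z)}
          = (1+z)^{-p + 2(p-1)}\, S_p^I(R)
          = (1+z)^{p-2}\, S_p^I(R)
```

For $p = 3$ the primordial skewness parameter therefore decays as $(1+z)$ towards low redshift, whereas the gravitationally induced contribution is redshift independent at tree level.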

At what scale does the approximation of linear perturbation theory, Eq. (292),
break down? The answer to this question is of course significantly model
dependent, but it is very important in order to constrain primordial
non-Gaussianity. Indeed, we can write the second and third moments from Eqs. (182)
and (183)

$$\sigma^2(R) = \sigma_I^2(R) + 2 \int d^3k\; W^2(kR) \int d^3q\; F_2(\mathbf{k} + \mathbf{q}, -\mathbf{q})\; B^I(\mathbf{k}, \mathbf{q}), \tag{293}$$

$$\langle \delta^3(R) \rangle = \langle \delta_I^3(R) \rangle + \langle \delta_G^3(R) \rangle + \int d^3k_1 \int d^3k_2\; W(k_1 R)\, W(k_2 R)\, W(k_{12} R) \int d^3q\; F_2(\mathbf{k}_1 + \mathbf{k}_2 - \mathbf{q}, \mathbf{q})\; P_4^I(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_1 + \mathbf{k}_2 - \mathbf{q}, \mathbf{q}), \tag{294}$$

where $k_{12} \equiv |\mathbf{k}_1 + \mathbf{k}_2|$, $B^I$ and $P_4^I$ denote the initial bispectrum
and trispectrum, respectively, and the subscript “G” denotes the usual contribution to
the third moment due to gravity from Gaussian initial conditions. Therefore,
as discussed in Sect. 4.4 for the bispectrum, corrections to the linear evolution
of $S_3$ depend on the relative magnitude of the initial bispectrum and
trispectrum compared to the usual gravitationally induced skewness.

This model dependence can be parametrized in a very useful way under the
additional assumption of spherical symmetry. In the spherical collapse model,
it is possible to work out entirely the perturbation expansion for one-point
moments from non-Gaussian initial conditions, but the solutions are not exact
as discussed further below (see footnote 42). Consider non-Gaussian initial
conditions with dimensional scaling. To take into account non-Gaussian terms, one
has to rewrite Eq. (286) as

$$S_p(\sigma_L) = \sum_{n=-p+2}^{-1} S_{p,n}\, \sigma_L^n + S_{p,0} + \sum_{n=1}^{\infty} S^L_{p,n}\, \sigma_L^n, \tag{295}$$

where $\sigma_L \equiv \sigma_I$ is given by linear theory as in Eq. (293). The first
non-vanishing perturbative contributions to the variance, skewness and kurtosis
read [255]

Here the non-Gaussianity in the initial conditions is characterized via the
dimensionless scaling amplitudes

$$B_p \equiv \frac{\langle \delta^p \rangle_c}{\sigma_I^p}. \tag{297}$$

41 Such a method is potentially extremely powerful, as galaxy biasing would tend
if anything to actually decrease the $S_p$ parameters with $z$, as bias tends to become
larger in the past, see e.g. [635] and discussion in Chapter 8.

42 Some additional results have been recently obtained for the PDF from specific
type of non-Gaussian initial conditions, see [662].

Table 8
Values of the higher-order perturbative contributions in the SC model from
non-Gaussian initial conditions with $B_J = 1$ for the unsmoothed ($n = -3$) and
smoothed ($n = -2, -1, 0$) density fields for a top-hat window and a power-law
spectrum.

SC               Unsmoothed     Smoothed
$B_J = 1$        n = -3         n = -2        n = -1        n = 0
$s_{2,3}$        0.62           0.29          -0.05         -0.38
$s_{2,4}$        1.87           0.74          0.44          0.98
$s_{2,5}$        3.36           0.60          -0.05         -1.05
$S_{3,0}$        5.05           4.21          3.38          2.55
$S^L_{3,1}$      7.26           3.91          1.55          0.19
$S^L_{3,2}$      23.53          7.37          1.18          0.20
$S_{4,-1}$       19.81          16.14         12.48         8.81
$S_{4,0}$        85.88          52.84         28.31         12.27
$S^L_{4,1}$      332.51         128.51        32.83         2.70

For non-Gaussian initial conditions seeded by topological defects such as
textures [655,252] or cosmic strings [146,9], $B_p$ is expected to be of order unity
(see footnote 43). For reference, Table 8 lists these results for $B_p = 1$ and
power-law initial spectra as a function of spectral index $n$; in this case it is clear
that non-linear corrections to the linear result, Eq. (292), can be very important
even at large scales. Even more so, $\chi^2$ initial conditions (with spectral index such
that it reproduces observations) have $B_3 \approx 2.5$ and $B_4 \approx 10$ [514,689];
therefore non-linear corrections are particularly strong [255,565].

When compared to exact PT calculations or to measurements in numerical
simulations, the SC model is seen to provide quite accurate predictions. This
is illustrated by Fig. 29 for the skewness and kurtosis in texture models [255].
These parameters evolve slowly from non-Gaussian initial conditions towards
the (Gaussian) gravitational predictions. However, even at present time, a
systematic shift can be observed in Fig. 29 between the Gaussian and the
non-Gaussian case, well described by the SC predictions taken at appropriate
order. The main signature of non-Gaussianity remains at the largest scales,
where the $S_p$ parameters show a sharp increase: this is the scaling regime of
Eq. (292) where observations can best constrain non-Gaussianity [594,195].
This is explicitly illustrated in Sect. 8.

43 For cosmic strings, this statement is valid if the scale considered is sufficiently
large, $R > 1.5\,(\Omega_m h^2)^{-1}$ Mpc, see [9] for details.

Fig. 29. The skewness and kurtosis, $S_3$ and $S_4$, for texture-like non-Gaussian
models. The triangles show the initial conditions ($\sigma_8 = 0.1$), which are fitted well
by the dimensional scaling, $S_3 = B_3/\sigma$ and $S_4 = B_4/\sigma^2$ with
$B_3 = B_4 \approx 0.5$, shown as the upper dotted line. Squares show $S_3$ and $S_4$ for a
later output corresponding to $\sigma_8 = 1.0$. The SC predictions for the $\sigma_8 = 1$
output are shown as short-dashed (including the second order contribution) and
long-dashed line (including the third order). The continuous line shows the
corresponding tree-level PT prediction for Gaussian initial conditions. The lower
dotted lines correspond to the linear theory prediction. In the right panel the dot
long-dashed line displays the SC prediction including the 4th perturbative
contribution. From [255].

5.7 Transients from Initial Conditions

The standard procedure in numerical simulations is to set up the initial per¬ 
turbations, assumed to be Gaussian, by using the Zel’dovich approximation 
(ZA, [705]). This gives a useful prescription to perturb the positions of particles 
from some initial homogeneous pattern (commonly a grid or a “glass” [688]) 
and assign them velocities according to the growing mode in linear perturba¬ 
tion theory. In this way, one can generate fluctuations with any desired power 
spectrum and then numerically evolve them forward in time to the present 
epoch. 

Although the ZA correctly reproduces the linear growing modes of density
and velocity perturbations, non-linear correlations are known to be inaccurate
when compared to the exact dynamics [274,355,46,116,356], see also Table 7.
This implies that it may take a non-negligible amount of time for the exact
dynamics to establish the correct statistical properties of density and velocity
fields. This transient behavior affects to a greater extent statistical quantities
which are sensitive to phase correlations of density and velocity fields; by
contrast, the two-point function, variance, and power spectrum of density
fluctuations at large scales can be described by linear perturbation theory,
and are thus unaffected by the incorrect higher-order correlations imposed by
the initial conditions.

In Sect. 2.4.6 we presented the solutions involving the full time dependence
from arbitrary initial conditions [561]. Again, we assume $\Omega_m = 1$ for simplicity.
The recursion relations for PT kernels including transients result from using
the following ansatz in Eq. (86),

$$\Psi_a^{(n)}(\mathbf{k}, z) = \int d^3k_1 \cdots \int d^3k_n\; [\delta_D]_n\; \mathcal{F}_a^{(n)}(\mathbf{k}_1, \ldots, \mathbf{k}_n; z)\; \delta_1(\mathbf{k}_1) \cdots \delta_1(\mathbf{k}_n), \tag{298}$$

where $a = 1, 2$, $z \equiv \ln a(\tau)$ with $a(\tau)$ the scale factor, and the $n$th order
solutions for density and velocity fields are components of the vector $\Psi_a$, i.e.
$\Psi_1^{(n)} \equiv \delta_n$, $\Psi_2^{(n)} \equiv \theta_n$. In Eq. (298),
$[\delta_D]_n \equiv \delta_D(\mathbf{k} - \mathbf{k}_1 - \cdots - \mathbf{k}_n)$.

The kernels $\mathcal{F}_a^{(n)}$ now depend on time and reduce to the standard ones when
transients die out, that is $\mathcal{F}_1^{(n)} \to F_n$, $\mathcal{F}_2^{(n)} \to G_n$ when
$z \to \infty$. Also, Eq. (298) incorporates in a convenient way initial conditions, i.e.
at $z = 0$, $\mathcal{F}_a^{(n)} = \mathcal{I}_a^{(n)}$, where the kernels $\mathcal{I}_a^{(n)}$
describe the initial correlations imposed at the start of the simulation. For the ZA
we have

$$\mathcal{I}_1^{(n)} = F_n^{\rm ZA}, \qquad \mathcal{I}_2^{(n)} = G_n^{\rm ZA}. \tag{299}$$


Although most existing initial conditions codes use the ZA prescription to set
up their initial conditions, there is another prescription to set initial velocities
suggested in [199], which avoids the high initial velocities that result from
the use of the ZA because of small-scale density fluctuations approaching unity
when starting a simulation at low redshifts. This procedure corresponds to
recalculating the velocities from the gravitational potential due to the perturbed
particle positions, obtained by solving again the Poisson equation after particles
have been displaced according to the ZA. Linear PT is then applied to the
density field to obtain the velocities, which implies instead that the initial
velocity field is such that the divergence field
$\Theta(\mathbf{x}) \equiv \theta(\mathbf{x})/(-f \mathcal{H})$ has the
same higher-order correlations as the ZA density perturbations. In this case,

$$\mathcal{I}_1^{(n)} = F_n^{\rm ZA}, \qquad \mathcal{I}_2^{(n)} = F_n^{\rm ZA}. \tag{300}$$

The recursion relations for $\mathcal{F}_a^{(n)}$, which solve the non-linear dynamics at
arbitrary order in PT, can be obtained by substituting Eq. (298) into Eq. (86), which
yields [561]

$$\mathcal{F}_a^{(n)}(\mathbf{k}_1, \ldots, \mathbf{k}_n; z) = e^{-nz}\, g_{ab}(z)\, \mathcal{I}_b^{(n)}(\mathbf{k}_1, \ldots, \mathbf{k}_n) + \sum_{m=1}^{n-1} \int_0^z ds\; e^{n(s-z)}\, g_{ab}(z-s)\, \gamma_{bcd}(\mathbf{k}^{(m)}, \mathbf{k}^{(n-m)})\; \mathcal{F}_c^{(m)}(\mathbf{k}_1, \ldots, \mathbf{k}_m; s)\; \mathcal{F}_d^{(n-m)}(\mathbf{k}_{m+1}, \ldots, \mathbf{k}_n; s), \tag{301}$$

where we have assumed the summation convention over repeated indices,
which run between 1 and 2. Equation (301) reduces to the standard recursion
relations for Gaussian initial conditions ($\mathcal{I}_a^{(n)} = 0$ for $n > 1$) when
transients are neglected, i.e. the time dependence of $\mathcal{F}_a^{(n)}$ is neglected and
the lower limit of integration is replaced by $s = -\infty$. Also, it is easy to check from
Eq. (301) that if $\mathcal{I}_a^{(n)} = (F_n, G_n)$, then
$\mathcal{F}_a^{(n)} = (F_n, G_n)$, as it should be. Note that PT
kernels in Eq. (301) are no longer a separable function of wave-vectors and
time.

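From the recursion relations given by Eq. (301), it is possible to find the
recursion relations for the smoothed vertices $\nu_n$ and $\mu_n$ as functions of scale
factor $a$ and smoothing scale $R$, and therefore infer the values of the cumulants
as functions of the $\gamma_p$’s [Eq. (280)] similarly as in Sect. 5.4, but with additional
dependence on the scale factor. For the skewness parameters, one finds in
the Einstein-de Sitter case

$$S_3(a) = \frac{[4 + \gamma_1]}{a} + \left\{\frac{34}{7} + \gamma_1\right\} - \frac{\gamma_1 + \frac{26}{5}}{a} + \frac{12}{35\, a^{7/2}} \tag{302}$$

$$\phantom{S_3(a)} = \frac{34}{7} + \gamma_1 - \frac{6}{5a} + \frac{12}{35\, a^{7/2}}, \tag{303}$$

$$T_3(a) = -\frac{[2 + \gamma_1]}{a} - \left\{\frac{26}{7} + \gamma_1\right\} + \frac{\gamma_1 + \frac{16}{5}}{a} + \frac{18}{35\, a^{7/2}} \tag{304}$$

$$\phantom{T_3(a)} = -\frac{26}{7} - \gamma_1 + \frac{6}{5a} + \frac{18}{35\, a^{7/2}}, \tag{305}$$

where we have assumed ZA initial velocities. On the other hand, for initial
velocities set from perturbed particle positions, we have:

$$S_3(a) = \frac{[4 + \gamma_1]}{a} + \left\{\frac{34}{7} + \gamma_1\right\} - \frac{\gamma_1 + \frac{22}{5}}{a} - \frac{16}{35\, a^{7/2}} \tag{306}$$

$$\phantom{S_3(a)} = \frac{34}{7} + \gamma_1 - \frac{2}{5a} - \frac{16}{35\, a^{7/2}}, \tag{307}$$

$$T_3(a) = -\frac{[4 + \gamma_1]}{a} - \left\{\frac{26}{7} + \gamma_1\right\} + \frac{\gamma_1 + \frac{22}{5}}{a} - \frac{24}{35\, a^{7/2}} \tag{308}$$

$$\phantom{T_3(a)} = -\frac{26}{7} - \gamma_1 + \frac{2}{5a} - \frac{24}{35\, a^{7/2}}. \tag{309}$$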

For $\Omega_m \neq 1$ these expressions are approximately valid upon replacing the
scale factor $a$ by the linear growth factor $D_1(\tau)$. The first term in square
brackets in Eqs. (302) and (304) represents the initial skewness given by the
ZA (e.g. [46]), which decays with the expansion as $a^{-1}$, as expected from
the discussion on non-Gaussian initial conditions in the previous section. The
second and remaining terms in Eqs. (302) and (304) represent the asymptotic
exact values (in between braces) and the transient induced by the exact
dynamics respectively; their sum vanishes at $a = 1$ where the only correlations
are those imposed by the initial conditions. Similar results to these are
obtained for higher-order moments; we refer the reader to [561] for explicit
expressions. Note that for scale-free initial conditions, the transient contributions
to $S_p$ and $T_p$ break self-similarity. Transients turn out to be somewhat
less important for velocities set from perturbed particle positions than in the
ZA prescription, as in this case higher-order correlations are closer to those in
the exact dynamics.

Fig. 30. The ratio of the tree-level $S_p$ parameters at scale factor $a$ to their
asymptotic exact dynamics value for scale-free initial spectra with spectral indices
$n = -1, 0$. From top to bottom $p = 3, \ldots, 8$. The values at $a = 1$ represent those
set by the ZA initial conditions.
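The time dependence of the skewness transient can be sketched with a two-mode decomposition. This is an illustrative shortcut, not the recursion of Eq. (301): it assumes that, in the Einstein-de Sitter case, the deviation of the second-order vertices $(\nu_2, \mu_2)$ from their asymptotic values $(34/21, 26/21)$ decays through two modes, $a^{-1}$ with direction $(1, 1)$ and $a^{-7/2}$ with direction $(1, -3/2)$, and that smoothing adds $\gamma_1$ to $S_3$ at all times. The closed forms coded for comparison, $S_3 = 34/7 + \gamma_1 - 6/(5a) + 12/(35a^{7/2})$ for ZA velocities and $S_3 = 34/7 + \gamma_1 - 2/(5a) - 16/(35a^{7/2})$ for velocities set from perturbed positions, are assumed to be the content of Eqs. (303) and (307); they satisfy the two constraints stated in the text (initial value $4 + \gamma_1$, asymptote $34/7 + \gamma_1$). All function names are hypothetical.

```python
import math

NU2_EXACT = 34.0 / 21.0   # asymptotic density vertex, S_3 -> 3 nu_2 = 34/7
MU2_EXACT = 26.0 / 21.0   # asymptotic velocity vertex, T_3 -> -3 mu_2 = -26/7

def s3_from_modes(a, gamma1, nu2_init, mu2_init):
    """S_3(a) from the assumed two-mode decay of the second-order vertices."""
    d_nu = nu2_init - NU2_EXACT
    d_mu = mu2_init - MU2_EXACT
    c_fast = (d_nu - d_mu) / 2.5    # a^(-7/2) mode, direction (1, -3/2)
    c_slow = d_nu - c_fast          # a^(-1) mode, direction (1, 1)
    nu2 = NU2_EXACT + c_slow / a + c_fast / a**3.5
    return 3.0 * nu2 + gamma1

def s3_za_velocities(a, gamma1):
    """Assumed closed form for ZA initial velocities."""
    return 34.0 / 7.0 + gamma1 - 6.0 / (5.0 * a) + 12.0 / (35.0 * a**3.5)

def s3_perturbed_positions(a, gamma1):
    """Assumed closed form for velocities set from perturbed particle positions."""
    return 34.0 / 7.0 + gamma1 - 2.0 / (5.0 * a) - 16.0 / (35.0 * a**3.5)

gamma1 = -3.0   # n = 0 power-law spectrum, gamma_1 = -(n + 3)
for a in (1.0, 1.66, 2.75, 4.2, 6.0):
    ratio = s3_za_velocities(a, gamma1) / (34.0 / 7.0 + gamma1)
    print(a, round(ratio, 3))
```

The initial vertices are $(\nu_2, \mu_2) = (4/3, 2/3)$ for ZA velocities and $(4/3, 4/3)$ when the velocity divergence shares the ZA density correlations, which is why the second prescription starts closer to the exact velocity vertex.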

Figure 30 illustrates these results for the skewness and higher-order $S_p$
parameters as functions of scale factor $a$ for different spectral indices, assuming that
velocities are set as in the ZA. The plots show the ratio of $S_p(a)$ to its “true”
asymptotic value predicted by PT, $S_p(\infty)$, for $3 \le p \le 8$. The values at $a = 1$
correspond to the ratio of ZA to exact dynamics $S_p$’s, which becomes smaller
as either $p$ or $n$ increases. For the skewness, it takes as much as $a = 6$ for $n = 0$
to come within 10% of the asymptotic exact PT value, whereas spectra with more
large-scale power, where the ZA works better, require a smaller expansion factor
to yield the same accuracy. As $p$ increases, however, the transients become
worse and at $p = 8$ an expansion by a factor $a = 40$ is required for $n = 0$ to
achieve 10% accuracy in $S_8$. This suggests that the tails of the PDF could be
quite affected by transients from initial conditions.

Figure 31 presents a comparison of the perturbative predictions for transients
in $S_p$ parameters with the standard CDM numerical simulation measurements
of [28]. In this case, initial velocities are set as in [199] rather than using the
ZA. The error bars in the measurements correspond to the variance over 10
realizations. If there were no transients and no other sources of systematic
uncertainties, all the curves would approach unity at large scales, where tree-level
PT applies. Unfortunately, there are other sources of systematic uncertainties
which prevent a clean test of the transient predictions from PT, as we now
briefly discuss; more details will be given in Sect. 6.12.

Fig. 31. Symbols show the ratio of the $S_p$ parameters for different scale factors $a$
(simulation began at $a = 1$) measured in SCDM numerical simulations [28] to their
asymptotic tree-level exact dynamics value as a function of smoothing scale $R$.
Symbols represent $a = 1$ (open triangles), $a = 1.66$ (filled triangles), $a = 2.75$ (open
squares) and $a = 4.2$ (filled squares). Error bars denote the variance of measurements
in 10 realizations. Solid lines correspond to the predictions of transients in
tree-level PT, expected to be valid at large scales.

The different symbols correspond to different outputs of the simulation: open
triangles denote initial conditions ($a = 1$, $\sigma_8 = 0.24$), solid triangles
($a = 1.66$, $\sigma_8 = 0.40$), open squares ($a = 2.75$, $\sigma_8 = 0.66$), and solid squares
($a = 4.2$, $\sigma_8 = 1.0$). For the initial conditions measurements (open triangles)
there is some disagreement with the ZA predictions, especially at small scales,
due to discreteness effects, which have not been corrected for. The initial
particle arrangement is a grid, therefore the Poisson model commonly used to
correct for discreteness is not necessarily a good approximation (see [28] for
further discussion of this point and Sect. 6.12.2 below). The second output time
(solid triangles) is perhaps the best for testing the predictions of transients:
discreteness corrections become much smaller due to evolution away from the
initial conditions, and the system has not yet evolved long enough for
finite volume corrections to be important (see also Sect. 6.12.1). For $S_3$ we see
excellent agreement with the predictions of Eq. (307), with a small excess at
small scales due to non-linear evolution away from the tree-level prediction. For
$p > 3$ the numerical results show a similar behavior with increased deviation
at small scales due to non-linear evolution, as expected. For the last two
outputs we see a further increase of non-linear effects at small scales, then a
reasonable agreement with the transients predictions, and lastly a decrease of
the numerical results compared to the PT predictions at large scales due to
finite volume effects, which increase with $\sigma_8$, $R$, and $p$ [147,28,150,472].
5.8 The Density PDF


Up to now, we have given exhaustive results on the local density moments.
In the following we show how these results can be used to reconstruct the
one-point density PDFs [44].


5.8.1 Reconstruction of the PDF from the Generating Function 
We use here the relation between the probability distribution function and
the generating function $\varphi(y)$, Eq. (142). To be able to use such a relation
one needs a supplementary non-trivial hypothesis. Indeed $\varphi(y)$ is a priori
$\sigma$-dependent through every $S_p$ parameter. We assume here that we have

$$\varphi(y, \sigma) \to \varphi(y) \quad \text{when} \quad \sigma \to 0,$$ (310)

in a uniform way, as suggested by numerical simulation results on $S_p$. No
proof has however been given of such a property. It has even been challenged
recently by calculations presented in [661,663], which suggest that $\varphi(y, \sigma)$
is not analytic at $y \to 0^-$ for finite values of $\sigma$. That would affect results
presented below (in particular the shape of the large density tails). In the
following we will ignore these subtleties and assume that, when the variance
is small enough, it is legitimate to compute the density PDF from,


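$$p(\delta)\,d\delta = \int_{-i\infty}^{+i\infty} \frac{dy}{2\pi i \sigma^2} \exp\left[-\frac{\varphi(y)}{\sigma^2} + \frac{y\,\delta}{\sigma^2}\right] d\delta,$$ (311)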


where $\varphi(y)$ is given by the system (262,263) by analytic continuation from the
point $\varphi(0) = 0$.

From this equation numerous results can be obtained. The different forms of
$p(\delta)$ have been described in detail in [16,17]. Taking advantage of the
approximation $\sigma \ll 1$ one can apply the saddle-point approximation to get


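$$p(\delta)\,d\delta = \frac{d\delta}{-\mathcal{G}_\delta'(\tau)} \left[\frac{1 - \tau\,\mathcal{G}_\delta''(\tau)/\mathcal{G}_\delta'(\tau)}{2\pi\sigma^2}\right]^{1/2} \exp\left(-\frac{\tau^2}{2\sigma^2}\right), \qquad \mathcal{G}_\delta(\tau) = \delta.$$ (312)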


This solution is valid when $\delta < \delta_c$, where $\delta_c$ is the value of the density
contrast for which $1 = \tau\,\mathcal{G}_\delta''(\tau)/\mathcal{G}_\delta'(\tau)$. Here the function $\mathcal{G}_\delta(\tau)$ is equal to
$\mathcal{G}_\delta^L(\tau)$ or $\mathcal{G}_\delta^E(\tau)$ depending on whether one works in Lagrangian space or in
Eulerian space while taking smoothing into account (Sect. 5.4.3).

When $\delta$ is larger than $\delta_c$ the saddle-point approximation is no longer valid. The
shape of $p(\delta)$ is then determined by the behavior of $\varphi(y)$ near its singularity
on the real axis,

$$\varphi(y) \simeq \varphi_s + r_s\,(y - y_s) - a_s\,(y - y_s)^{3/2},$$ (313)
Table 9

Parameters of the singularity (313) for different values of the spectral index $n$ (there
is no singularity for $n \ge 0$).

  $n$     $\delta_c$   $y_s$     $r_s$   $a_s$    $\varphi_s$
  -3      0.656        -0.184    1.66    1.84     -0.030
  -2.5    0.804        -0.213    1.80    2.21     -0.041
  -2      1.034        -0.253    2.03    2.81     -0.058
  -1.5    1.44         -0.310    2.44    3.93     -0.093
  -1      2.344        -0.401    3.34    6.68     -0.172
  -0.5    5.632        -0.574    6.63    18.94    -0.434


and we have

$$p(\delta)\,d\delta = \frac{3\,a_s\,\sigma}{4\sqrt{\pi}}\,(1 + \delta - r_s)^{-5/2} \exp\left[-|y_s|\,\delta/\sigma^2 + |\varphi_s|/\sigma^2\right] d\delta.$$ (314)


Table 9 gives the parameters describing the singularity corresponding to
different values of the spectral index, for the PDF of the smoothed density field in
Eulerian space (footnote 44). One sees that the shape of the cut-off is very different from
that of a Gaussian distribution. This shape is due to the analytic properties of
the generating function $\varphi(y)$ on the real axis. We explicitly assume here that
Eq. (310) is valid, in particular that the position of the first singularity
is at finite distance from the origin when $\sigma$ is finite. It has been pointed out
in [663] that Eq. (263) admits a second branch for $y_s < y < 0$ which
cannot be ignored in the computation of the density PDF for finite values of $\sigma$.
In practice its effect is modest. It however affects the analytical properties of
$\varphi(y)$ and therefore the shape of the large density tail, Eq. (314).

Numerically it is always possible to integrate Eq. (311) without using the
saddle-point approximation. It is then useful to take advantage of the weak
$\Omega_m$ and $\Omega_\Lambda$ dependence of the vertex generating function. In particular one
can use

$$\mathcal{G}_\delta^L(\tau) = \left(1 + \frac{2\tau}{3}\right)^{-3/2} - 1,$$ (315)

which is the exact result for the spherical collapse dynamics when $\Omega_m \to 0$,
$\Omega_\Lambda = 0$. This leads to a slight over-estimation of the low-order vertices [in this
case $S_3 = 5 - (n + 3)$ for instance] but the power-law behavior at large $\tau$
is correctly reproduced. For this $\mathcal{G}_\delta^L$ and for a power-law spectrum, $\tau$ can be
explicitly written in terms of $\mathcal{G}_\delta^E$. It is interesting to note that for $n = 0$ there

44 The case $n = -3$ corresponds as well to the PDF in Lagrangian space or to the
unsmoothed case.
Fig. 32. Comparison between predictions of tree-level PT with results of $N$-body
simulations in the standard CDM model [predictions were calculated assuming
Eq. (315)]. From [44].

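is no singularity, the saddle point approximation reduces to Eq. (312) and the
Eulerian PDF of the smoothed density field reads,

$$p_{n=0}(\delta)\,d\delta = \sqrt{\frac{(1 + \delta)^{-5/3} + (1 + \delta)^{-7/3}}{4\pi\sigma^2}}\; \exp\left[-\frac{9}{8}\,\frac{\left((1 + \delta)^{2/3} - 1\right)^2}{(1 + \delta)^{1/3}\,\sigma^2}\right] d\delta.$$ (316)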


One can also obtain the PDF from the SC model using the local Lagrangian
mapping [256,554]. The PDFs that are obtained are in good agreement with
results of numerical simulations. In Fig. 32, PT predictions for different
smoothing scales are compared to measurements in a P$^3$M simulation for the standard
CDM model. The predicted shape for the PDF (computed from the measured
variance and known linear spectral index) is in remarkable agreement with the
$N$-body results.
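As a quick numerical illustration of the closed form for $n = 0$, the following sketch evaluates the saddle-point PDF, $p(\delta) = \sqrt{[(1+\delta)^{-5/3} + (1+\delta)^{-7/3}]/(4\pi\sigma^2)}\,\exp\{-9[(1+\delta)^{2/3}-1]^2/[8(1+\delta)^{1/3}\sigma^2]\}$, and integrates it with a simple trapezoidal rule. The function names, the value $\sigma = 0.2$ and the integration range are illustrative assumptions and are not taken from [44]; the saddle-point form is only expected to be normalized up to corrections of order $\sigma^2$.

```python
import math

def pdf_n0(delta, sigma):
    """Saddle-point PDF of the smoothed density contrast for n = 0."""
    rho = 1.0 + delta
    prefactor = math.sqrt((rho ** (-5.0 / 3.0) + rho ** (-7.0 / 3.0))
                          / (4.0 * math.pi * sigma ** 2))
    exponent = (-9.0 * (rho ** (2.0 / 3.0) - 1.0) ** 2
                / (8.0 * rho ** (1.0 / 3.0) * sigma ** 2))
    return prefactor * math.exp(exponent)

def trapezoid(func, lower, upper, steps):
    """Composite trapezoidal rule on a uniform grid."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width

sigma = 0.2  # illustrative value of the rms density fluctuation
# The PDF is negligible outside this range for the chosen sigma.
norm = trapezoid(lambda d: pdf_n0(d, sigma), -0.999, 4.0, 20000)
# At delta = 0 the expression reduces to the peak of a Gaussian of width sigma.
peak_ratio = pdf_n0(0.0, sigma) * math.sqrt(2.0 * math.pi) * sigma
print(f"normalization = {norm:.4f}, peak / Gaussian peak = {peak_ratio:.4f}")
```

For larger values of $\sigma$ the departure of the normalization from unity grows, which is one reason why a direct numerical integration of Eq. (311) is preferable when accuracy near the maximum matters.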


5.8.2 Dependence on Cosmological Parameters 

The dependence of the shape of the PDF on cosmological parameters is entirely
contained in the spherical collapse dynamics when the density field is expressed
in terms of the linear density contrast. It can be examined for instance in
terms of the position of the critical density contrast, $\delta_c$. The variation of $\delta_c$
with cosmology is rather modest, as shown in Fig. 33 for $\Omega_\Lambda = 0$. This result
applies also to the overall shape of $\mathcal{G}_\delta$ (see [44,45]), for which the dependence
on cosmological parameters remains extremely weak, at the percent level. This
extends what has been found explicitly for the $S_3$ and $S_4$ parameters.







Fig. 33. Variation of the position of the critical (linear) value for the density contrast
as a function of $\Omega_m$ for open cosmologies.

5.8.3 The PDF in the Zel’dovich Approximation 

For approximate dynamics such as ZA the previous construction can also be
done. It follows exactly the same scheme and the tree-order cumulant
generating function can be obtained through the ZA spherical collapse
dynamics [471,49] (footnote 45). It is given by

$$\mathcal{G}_\delta^{ZA}(\tau) = \left(1 - \frac{\tau}{3}\right)^{-3} - 1.$$ (317)

One could then compute the inverse Laplace transform of the cumulant
generating function to get the one-point density PDF. As in the previous case,
this result is not exact in the sense that it is based on the leading order result
for the cumulants.

In the case of the ZA it is actually possible to do an a priori much more
accurate calculation with a direct approach. Indeed, the local density contrast
neglecting filtering effects is given by the inverse Jacobian of the deformation
tensor, Eq. (93), and the joint PDF of the eigenvalues can then be explicitly
calculated [190]

$$p(\lambda_1, \lambda_2, \lambda_3) = \frac{5^{5/2}\,27}{8\pi\sigma^6}\,(\lambda_3 - \lambda_1)(\lambda_3 - \lambda_2)(\lambda_2 - \lambda_1)\,\exp\left[-\frac{3}{\sigma^2}(\lambda_1 + \lambda_2 + \lambda_3)^2 + \frac{15}{2\sigma^2}(\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3)\right],$$ (318)

where we have assumed that $\lambda_1 < \lambda_2 < \lambda_3$. From this it is possible to compute
the shape of the one-point density PDF [382,49],


45 Extension to other non-linear approximations discussed in Sect. 2.8 is considered 
as well in [471]. In addition, recent works have focussed on the PDF generated by 
second-order PT [644,682]; however, these neglect the effects of smoothing. 
, (320)


where $N_s$ is the mean number of streams; $N_s = 1$ in the single stream regime.
The above prediction for the PDF is however of limited value because, in the
absence of smoothing, there is an accumulation of density values at infinity.
This is due to the fact that there is always a finite probability of forming
caustics (where the Jacobian vanishes). An unfortunate consequence of this
is that the moments of this distribution are always infinite! This does not,
however, contradict the results given in Sect. 5.5 as shown in [49]: when a
cut-off is applied to the large density tail, the moments remain finite, and behave
as expected from the PT calculations. This has been explicitly verified up to
one-loop order [557].


5.9 Two-Dimensional Dynamics 


The case of gravitational instability in two spatial dimensions (2D) might
be viewed as quite academic. It is however worth investigating for different
reasons: (i) it is a good illustration of the general method; (ii) numerical 
simulations in 2D dynamics can be done with a much larger dynamical range 
than in 3D; and, perhaps most importantly, (iii) the 2D results turn out to be 
of direct use to study statistical properties of the projected density (Sect. 7.2), 
relevant for observations of angular clustering and weak gravitational lensing. 
The dynamics we are interested in corresponds actually to density fluctuations 
embedded in a 3D space but which are uniform along one direction. The 
general equations of motion are left unchanged; here, we consider again only 
the Einstein-de Sitter case. 

Let us review the different stages of the calculation [48]. For the naked
vertices, without smoothing effects, the only change introduced is due to the
$\cos^2(\mathbf{k}_1, \mathbf{k}_2)$ factor that in 2D averages to 1/2 instead of 1/3. The resulting
recursion relations between the vertices $\nu_n$ and $\mu_n$ then read,
(321)

(322) 


instead of Eqs. (50) and (51). No simple solution for the generating function of
$\nu_n$, $\mathcal{G}_\delta^{2D}(\tau)$, is known although it again corresponds to the equation describing
the “spherical” collapse in 2D (footnote 46). It can however be shown that
$\mathcal{G}_\delta^{2D}(\tau) + 1 \sim \tau^{-(\sqrt{13} - 1)/2}$ when $\tau \to \infty$, and the expression

$$\mathcal{G}_\delta^{2D}(\tau) = \left(1 + \frac{\tau}{\nu}\right)^{-\nu} - 1 \quad \text{with} \quad \nu = \frac{\sqrt{13} - 1}{2}$$ (323)

provides a good fit. More precisely one can rigorously calculate the expansion
of $\mathcal{G}_\delta^{2D}(\tau)$ near $\tau = 0$ and it reads

$$\mathcal{G}_\delta^{2D}(\tau) = -\tau + \frac{12}{14}\tau^2 - \frac{29}{42}\tau^3 + \frac{79}{147}\tau^4 - \frac{2085}{5096}\tau^5 + \ldots$$ (324)


The resulting values for the $S_p^{2D}$ parameters when smoothing is neglected are
$S_3^{2D} = 36/7$, $S_4^{2D} = 2540/49$, $S_5^{2D} \approx 793$, $S_6^{2D} \approx 16370$. When filtering is taken
into account the vertex generating function becomes (footnote 47),
for a power-law spectrum of index $n$. This leads to [48]
Obviously, these results can also be obtained from a direct perturbative
calculation using the geometrical properties of the 2D top-hat window function
given in Appendix C. The position and shape of the singularity are also changed
in 2D dynamics. In Table 10 we give the parameters of the singularity in $\varphi(y)$.
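The small-$\tau$ expansion of the unsmoothed 2D vertex generating function can be checked with exact rational arithmetic. The sketch below is not the derivation of [48]; it assumes that, for an Einstein-de Sitter background, a top-hat perturbation with $d$-dimensional symmetry obeys $(1+\delta)(\delta'' + \delta'/2) - (1 + 1/d)\,\delta'^2 = \tfrac{3}{2}\,\delta\,(1+\delta)^2$ with primes denoting derivatives with respect to $\ln a$, and solves it order by order for $\delta = \sum_n c_n \epsilon^n$ with $\epsilon \propto a$ the linear density contrast. With $\tau = -\epsilon$ the coefficient of $\tau^n$ is $(-1)^n c_n$ and $\nu_n = n!\,c_n$. The function names and the tree-level relations used for $S_3$ to $S_6$ (counting labeled trees, with $\nu_1 = 1$) are assumptions of this example.

```python
from fractions import Fraction
from math import factorial

def collapse_coefficients(dim, order):
    """Coefficients c_n of delta = sum_n c_n eps^n (c_1 = 1) for a top-hat
    perturbation with dim-dimensional symmetry in an Einstein-de Sitter universe."""
    c = [Fraction(0), Fraction(1)]
    kappa = 1 + Fraction(1, dim)
    for n in range(2, order + 1):
        c.append(Fraction(0))  # placeholder for the unknown c_n
        first = [k * ck for k, ck in enumerate(c)]                          # delta'
        damped = [(k * k + Fraction(k, 2)) * ck for k, ck in enumerate(c)]  # delta'' + delta'/2
        square = [sum(c[j] * c[k - j] for j in range(k + 1)) for k in range(n + 1)]

        def order_n(a, b):
            # Coefficient of eps^n in the product of two series.
            return sum(a[k] * b[n - k] for k in range(n + 1))

        lhs_known = order_n(c, damped) - kappa * order_n(first, first)
        rhs_known = Fraction(3, 2) * (2 * square[n] + order_n(square, c))
        # c_n enters linearly through (n^2 + n/2) c_n on the left and (3/2) c_n on the right.
        c[n] = (rhs_known - lhs_known) / (n * n + Fraction(n, 2) - Fraction(3, 2))
    return c

def tree_level_s_parameters(c):
    """S_3 ... S_6 from the vertices nu_n = n! c_n (unsmoothed, tree level)."""
    nu = [factorial(n) * cn for n, cn in enumerate(c)]
    s3 = 3 * nu[2]
    s4 = 4 * nu[3] + 12 * nu[2] ** 2
    s5 = 5 * nu[4] + 60 * nu[3] * nu[2] + 60 * nu[2] ** 3
    s6 = (6 * nu[5] + 120 * nu[4] * nu[2] + 90 * nu[3] ** 2
          + 720 * nu[3] * nu[2] ** 2 + 360 * nu[2] ** 4)
    return s3, s4, s5, s6

c2d = collapse_coefficients(2, 5)
s2d = tree_level_s_parameters(c2d)
print("2D coefficients c_2..c_5:", [str(x) for x in c2d[2:]])
print("2D S_3..S_6:", [float(s) for s in s2d])
```

Calling `collapse_coefficients(3, 5)` instead gives the corresponding 3D vertices, so the same routine can be used to compare the two geometries.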


5.10 The Velocity Divergence PDF 

So far our description has been focussed on the density field. The structure of
the equations for the velocity divergence is the same as for the local density.
We briefly account here for the results that have been obtained at tree level

46 To our knowledge there is no closed analytical solution for the 2D spherical
collapse.

47 In 2D dynamics if $P(k) \sim k^n$ then $\sigma^2(R) \propto R^{-(n+2)}$.
Table 10

Parameters of the singularity, Eq. (313), for the 2D case. There is no singularity for
$n \ge 0$.

  $n$     $y_s$     $\varphi_s$   $r_s$   $a_s$
  -2      -0.172    -0.197        1.60    -1.72
  -1.5    -0.212    -0.252        1.81    -2.25
  -1      -0.277    -0.350        2.23    -3.41
  -0.5    -0.403    -0.581        3.55    -7.73


for the velocity divergence [44]. Loop corrections with exact PT are discussed
in e.g. [557]. Note that the SC model approximation described in Sect. 5.5.2
does not do as well as for the density contrast, due to tidal contributions (footnote 48),
but can provide again approximate loop corrections for the cumulants while
still giving exact tree-level results [223].


5.10.1 The Velocity Divergence Cumulants Hierarchy 

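In what follows, we assume that the velocity divergence is expressed in units of
the conformal expansion rate, $\mathcal{H} = aH$. For convenience, we define the vertex
generating function for the velocity divergence as

$$\mathcal{G}_\theta(\tau) \equiv -f(\Omega_m, \Omega_\Lambda) \sum_{p \ge 1} \mu_p \frac{(-\tau)^p}{p!} \equiv \sum_{p \ge 1} \tilde\mu_p \frac{(-\tau)^p}{p!}.$$ (330)

This definition corresponds to slightly different vertices from those given by
Eq. (49),

$$\tilde\mu_p \equiv \langle \theta^{(p)} [\delta^{(1)}]^p \rangle_c / \langle [\delta^{(1)}]^2 \rangle^p.$$ (331)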


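When the filtering effect is not taken into account the vertex generating
function can be obtained from that of the density field. From the continuity
equation we have [43],

$$\mathcal{G}_\theta(a, \tau) = -\left[a \frac{\partial}{\partial a} \mathcal{G}_\delta(a, \tau) + f(\Omega_m, \Omega_\Lambda)\,\tau \frac{\partial}{\partial \tau} \mathcal{G}_\delta(a, \tau)\right] \left[1 + \mathcal{G}_\delta(a, \tau)\right]^{-1}.$$ (332)

One can use the fact that the function $\mathcal{G}_\delta(a, \tau)$ is nearly insensitive to the values
of $\Omega_m$ and $\Omega_\Lambda$ to obtain a simplified form for the function $\mathcal{G}_\theta(a, \tau)$,

$$\mathcal{G}_\theta(\tau) \approx -f(\Omega_m, \Omega_\Lambda)\,\tau \frac{d}{d\tau} \mathcal{G}_\delta(\tau) \Big/ \left[1 + \mathcal{G}_\delta(\tau)\right],$$ (333)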


48 Velocities are more affected by previrialization effects, as shown in Fig. 12. 
so that $\mathcal{G}_\theta(\tau) \approx f(\Omega_m, \Omega_\Lambda)\,\tau\,(1 + 2\tau/3)^{-1}$ if the approximation in Eq. (315) is used.
This in fact fully justifies the definition of the vertices $\mu_p$, which are seen to
be almost independent of the cosmological parameters, as already discussed
in Sect. 2.4.3.

From now on, we use again for clarity the Lagrangian and Eulerian
superscripts, in particular $\mathcal{G}_\theta^L \equiv \mathcal{G}_\theta$, $\mathcal{G}_\delta^L \equiv \mathcal{G}_\delta$. Including filtering effects requires
taking into account the mapping from Lagrangian to Eulerian space, as
explained in Sect. 5.4.3. As a consequence of this we have

$$\mathcal{G}_\theta^E(\tau) = \mathcal{G}_\theta^L\left(\tau\,\frac{\sigma\left([1 + \mathcal{G}_\delta^E(\tau)]^{1/3} R\right)}{\sigma(R)}\right),$$ (334)

which amounts to saying that the velocity divergence should be calculated at the
correct mass scale. This remapping does not further complicate the
dependence on cosmological parameters: $\mathcal{G}_\theta^E(\tau)/f(\Omega_m, \Omega_\Lambda)$ remains independent of
$(\Omega_m, \Omega_\Lambda)$ to a very good accuracy.

It is possible to derive the cumulants $T_p$ from the implicit Eq. (334),
relying on the usual relations given in Sect. 5.4.1 between the cumulants
and what would be the genuine intrinsic velocity divergence vertices,
$\mu_p^{\rm intr} \equiv \langle \theta^{(p)} [\theta^{(1)}]^p \rangle_{c,E} / \langle [\theta^{(1)}]^2 \rangle_E^p$, that are straightforwardly related to
$\tilde\mu_p = -f(\Omega_m, \Omega_\Lambda)\,\mu_p$ through $\mu_p^{\rm intr} = \tilde\mu_p\,[-f(\Omega_m, \Omega_\Lambda)]^{-p}$. The corresponding
vertex generating function, $\mathcal{G}_\theta^{\rm intr}(\tau)$, is given by
$\mathcal{G}_\theta^{\rm intr}(\tau) = \mathcal{G}_\theta^E[-\tau/f(\Omega_m, \Omega_\Lambda)]$; together with Eqs. (260), (262) and
(263), and replacing $S_p$ with $T_p$ and $\mathcal{G}_\delta$ with $\mathcal{G}_\theta^{\rm intr}$, it can be used to compute the
velocity divergence cumulant parameters. For an Einstein-de Sitter universe,
the first two read


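$$T_3(\Omega_m = 1, \Omega_\Lambda = 0) = -\left(\frac{26}{7} + \gamma_1\right),$$ (335)
$$T_4(\Omega_m = 1, \Omega_\Lambda = 0) = \frac{12088}{441} + \frac{338}{21}\gamma_1 + \frac{7}{3}\gamma_1^2 + \frac{2}{3}\gamma_2,$$ (336)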


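where the parameters $\gamma_p$ are given by Eq. (280). Furthermore, the dependence
on cosmological parameters is straightforwardly given by (footnote 49)

$$T_p(\Omega_m, \Omega_\Lambda) \approx \frac{1}{f(\Omega_m, \Omega_\Lambda)^{p-2}}\,T_p(\Omega_m = 1, \Omega_\Lambda = 0),$$ (337)

which implies a relatively strong $\Omega_m$ dependence for the shape of $p(\theta)$ as we
now discuss.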


49 To be compared, for example, to the more accurate result given for $T_3$ in Eq. (245).
5.10.2 The Shape of the PDF

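The above line of arguments provides a general rule for the dependence of the
PDF on cosmological parameters:

$$p[f(\Omega_m, \Omega_\Lambda), \theta, \sigma_\theta]\,d\theta \equiv p\left[1, \frac{\theta}{f(\Omega_m, \Omega_\Lambda)}, \frac{\sigma_\theta}{f(\Omega_m, \Omega_\Lambda)}\right] \frac{d\theta}{f(\Omega_m, \Omega_\Lambda)}.$$ (338)

Otherwise, the PDF can be calculated in exactly the same way as for the density
contrast.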

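The case $n = -1$ is worth further investigation since it is then possible to
derive a closed form that fits extremely well the exact numerical integration,
similarly to the PDF of $\delta$ for $n = 0$. This approximation is based on the
approximate form in Eq. (315) for the function $\mathcal{G}_\delta^L$. With $n = -1$ it leads to

$$\mathcal{G}_\delta^{E,\,n=-1}(\tau) = \left[\left(1 + \frac{\tau^2}{9}\right)^{1/2} + \frac{\tau}{3}\right]^{-3} - 1.$$ (339)

One can then show that

$$\mathcal{G}_\theta^{E,\,n=-1}(\tau) = f(\Omega_m, \Omega_\Lambda) \left[\tau \left(1 + \frac{\tau^2}{9}\right)^{1/2} - \frac{\tau^2}{3}\right].$$ (340)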


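The calculation of the PDF of the velocity divergence from the saddle-point
approximation [e.g. Eq. (312)] then leads to the expression,

$$p(\theta)\,d\theta = \frac{\left([2\kappa - 1]/\kappa^{1/2} + [\lambda - 1]/\lambda^{1/2}\right)^{-3/2}}{\kappa^{3/4}\,(2\pi)^{1/2}\,\sigma_\theta} \exp\left[-\frac{\theta^2}{2\lambda\sigma_\theta^2}\right] d\theta,$$ (341)

with

$$\kappa \equiv 1 + \frac{\theta^2}{9\lambda f(\Omega_m, \Omega_\Lambda)^2}, \qquad \lambda \equiv 1 - \frac{2\theta}{3 f(\Omega_m, \Omega_\Lambda)},$$ (342)

where $\theta$ is expressed in units of the conformal expansion rate, $\mathcal{H}$.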
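The closed form for the velocity divergence PDF is easy to evaluate numerically. The sketch below implements $p(\theta) = ([2\kappa-1]/\kappa^{1/2} + [\lambda-1]/\lambda^{1/2})^{-3/2}\,\kappa^{-3/4}(2\pi)^{-1/2}\sigma_\theta^{-1}\exp[-\theta^2/(2\lambda\sigma_\theta^2)]$ with $\kappa = 1 + \theta^2/(9\lambda f^2)$ and $\lambda = 1 - 2\theta/(3f)$, checks its normalization with a trapezoidal rule, and verifies the scaling rule $p[f, \theta, \sigma_\theta] = p[1, \theta/f, \sigma_\theta/f]/f$. The function names, the choice $\sigma_\theta = 0.4$ and the integration range are illustrative assumptions. The expression is only defined for $\lambda > 0$, that is for $\theta < 3f/2$; the code returns zero outside that range.

```python
import math

def pdf_theta(theta, sigma_theta, f=1.0):
    """Closed-form n = -1 velocity divergence PDF (theta in units of the
    conformal expansion rate); zero where lambda <= 0."""
    lam = 1.0 - 2.0 * theta / (3.0 * f)
    if lam <= 0.0:
        return 0.0
    kappa = 1.0 + theta ** 2 / (9.0 * lam * f ** 2)
    base = (2.0 * kappa - 1.0) / math.sqrt(kappa) + (lam - 1.0) / math.sqrt(lam)
    if base <= 0.0:  # guards against round-off very close to lambda = 0
        return 0.0
    amplitude = base ** -1.5 / (kappa ** 0.75 * math.sqrt(2.0 * math.pi) * sigma_theta)
    return amplitude * math.exp(-theta ** 2 / (2.0 * lam * sigma_theta ** 2))

def trapezoid(func, lower, upper, steps):
    """Composite trapezoidal rule on a uniform grid."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width

sigma_theta = 0.4  # illustrative value
# The PDF is negligible outside [-8, 1.4] for this sigma_theta and f = 1.
norm = trapezoid(lambda t: pdf_theta(t, sigma_theta), -8.0, 1.4, 20000)

# Scaling with f: p[f, theta, sigma] = p[1, theta / f, sigma / f] / f.
f = 0.4
lhs = pdf_theta(-0.3, sigma_theta, f)
rhs = pdf_theta(-0.3 / f, sigma_theta / f, 1.0) / f
print(f"normalization = {norm:.5f}, scaling check = {lhs / rhs:.12f}")
```

The negative-$\theta$ tail (expanding regions in the sign convention of Eq. (333)) is exponential rather than Gaussian, which is why the integration range is chosen asymmetrically.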


5.10.3 Comparison with N-Body Simulations 

Measurements in numerical simulations turn out to be much more non-trivial
for the velocity field than for the density field. The reason is that in $N$-body
simulations, the density field is traced by a Poisson realization. Although it
suffices to count points, in grid cells for instance, to get the filtered density (footnote 50),
the velocity field is only known in a non-uniform way where particles happen
to be. Therefore, simple averages of velocities do not lead to good estimations
of the statistical properties one is interested in, especially when the number
density of particles is small.

50 Corrected for discreteness effects using factorial moments as discussed in Sect. 6.7. 
Fig. 34. The PDF of the velocity divergence for two different values of $\Omega_m$ ($\Omega_m = 1$,
left panel, and $\Omega_m = 0.2$, right panel). The dotted lines correspond to the
approximate analytic fit [Eq. (341)] and the solid lines to the theoretical predictions
obtained from a direct numerical integration of the inverse Laplace transform with
$n = -0.7$. In the right panel the dashed line is the prediction for $\Omega_m = 1$ and the same
$\sigma_\theta \approx 0.4$. From [54].

For this purpose specific methods have been developed to deal with
velocity field statistics [52]. The idea is to use tessellations to obtain a continuous
description of the velocity field; two alternative prescriptions have been
proposed. One makes use of the Voronoi tessellation; in this case the velocity is
assumed to be uniform within each Voronoi cell, in other words, the local
velocity at any space point is that of the closest particle. The second method
makes use of the Delaunay tessellation. In this case the local velocity is
assumed to vary linearly within each Delaunay tetrahedron (such an ensemble of
tetrahedra forms a unique partition of space); the local velocity is then defined
by a linear combination of its closest neighbors, see [52,54] for details.

These methods have been applied to results of numerical simulations [54,387].
Comparisons between theoretical predictions, in particular the form (341),
and the measurements are shown in Fig. 34. The simulation used here is a PM
simulation with a scale-free spectrum with $n = -1$. The prediction, Eq. (341),
gives a good account of the shape of the divergence PDF, especially in the tails.
The detailed behavior of the PDF near its maximum requires a more exact
computation. We obtained it here by an exact inverse Laplace computation
using Eq. (315) for the density vertex generating function [and Eq. (333)] to get
the velocity vertices. Because this expression does not accurately predict the
low-order cumulants (footnote 51), the integration has been made with $n = -0.7$, instead
of $n = -1$, to compensate for this problem. The agreement with simulations
is quite remarkable.


51 For example, $T_3 = 4 - (n + 3)$ instead of $T_3 = 26/7 - (n + 3)$.
Fig. 35. Example of a joint PDF of the density and the velocity divergence. The color
is in logarithmic scale, the smoothing scale is 15 Mpc/h, the spectrum is scale-free
with $n = -1.5$, and $\sigma_8 = 1$, see [56] for details.

5.11 The Velocity-Density Relation 

PT also allows one to consider multivariate PDFs such as the joint
distribution of the local density contrast $\delta$ and the local divergence $\theta$. An example
of such a PDF is shown in Fig. 35. It illustrates in particular the fact that the
local density and local divergence do not follow in general a one-to-one
correspondence, as would be the case in linear perturbation theory. Deviations
from this regime induce not only a nonlinear relation between $\delta$ and $\theta$, i.e. a
bending in the $\delta$-$\theta$ relation, but also a significant scatter.

In general the statistical properties of these two fields can be studied through
their joint cumulants, $\langle \delta^p \theta^q \rangle_c$. Similarly to cases involving only one variable it
is possible to compute such quantities at leading order, or at next to leading
order (involving loop corrections) in PT. One can define the parameters $U_{pq}$
as,

$$\langle \delta^p \theta^q \rangle_c = U_{pq}\,\langle \delta^2 \rangle^{p+q-1},$$ (343)

where $\theta$ is expressed in units of the conformal expansion rate, $\mathcal{H}$. The $U_{pq}$'s
are finite (and non-zero) at large scales for Gaussian initial conditions and can
be easily computed at tree order. Their calculation follows a tree construction
from the vertices $\nu_p$ and $\tilde\mu_q$. For instance, one obtains
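$$U_{11} = \nu_1 \tilde\mu_1 = -f(\Omega_m, \Omega_\Lambda),$$
$$U_{21} = 2\nu_2 \tilde\mu_1 + \tilde\mu_2,$$
$$U_{31} = 3\nu_3 \tilde\mu_1 + \tilde\mu_3 + 6\nu_2^2 \tilde\mu_1 + 6\nu_2 \tilde\mu_2,$$
$$U_{22} = 2\nu_3 \tilde\mu_1^2 + 2\tilde\mu_3 \tilde\mu_1 + 8\nu_2 \tilde\mu_2 \tilde\mu_1 + 2\nu_2^2 \tilde\mu_1^2 + 2\tilde\mu_2^2,$$

with $\tilde\mu_p \equiv -f(\Omega_m, \Omega_\Lambda)\,\mu_p$.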

These expressions are straightforward when the smoothing effects are not
taken into account. They are still true otherwise, but they rely on the fact
that the same mapping applies to the density and the velocity divergence.
More generally it is possible to derive explicitly the generating function of the
joint cumulants. The demonstration is presented in Appendix B.2.
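The tree construction behind the $U_{pq}$ coefficients can be reproduced by brute-force enumeration. The sketch below is an illustration rather than the method of the references: it uses Prüfer sequences, a standard bijection between labeled trees on $N$ vertices and sequences of length $N - 2$ in which a vertex of degree $d$ appears $d - 1$ times, so that only the vertex degrees are needed. Each density point of degree $d$ contributes a factor $\nu_d$ and each divergence point a factor $\tilde\mu_d$ (written `nu` and `mu` below); factors $\nu_1 = 1$ are dropped. The function name and the string labels are hypothetical.

```python
from collections import Counter
from itertools import product

def joint_cumulant_terms(p, q):
    """Tree-level terms of U_pq: maps a sorted tuple of vertex factors
    (for example ("mu1", "nu2")) to the number of labeled trees producing it."""
    n_points = p + q
    kinds = ["nu"] * p + ["mu"] * q
    terms = Counter()
    # Every Pruefer sequence of length n_points - 2 encodes exactly one labeled tree.
    for sequence in product(range(n_points), repeat=n_points - 2):
        degrees = [1 + sequence.count(i) for i in range(n_points)]
        factors = tuple(sorted(
            f"{kind}{deg}" for kind, deg in zip(kinds, degrees)
            if not (kind == "nu" and deg == 1)   # nu_1 = 1 is omitted
        ))
        terms[factors] += 1
    return terms

for p, q in [(1, 1), (2, 1), (3, 1), (2, 2)]:
    pieces = [f"{count} * {' '.join(factors)}"
              for factors, count in sorted(joint_cumulant_terms(p, q).items())]
    print(f"U_{p}{q} =", " + ".join(pieces))
```

The multiplicities for a given $(p, q)$ add up to $N^{N-2}$ with $N = p + q$, the total number of labeled trees, which provides a simple consistency check on any hand-derived expression.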

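An interesting application of these results is the computation of the joint
density-velocity PDF. Assuming that the leading order contributions to
cumulants provide a reliable description, we have

$$p(\delta, \theta)\,d\delta\,d\theta = \int_{-i\infty}^{+i\infty} \frac{dy_1}{2\pi i \sigma^2} \int_{-i\infty}^{+i\infty} \frac{dy_2}{2\pi i \sigma^2} \exp\left[\frac{\delta\,y_1}{\sigma^2} + \frac{\theta\,y_2}{\sigma^2} - \frac{\varphi(y_1, y_2)}{\sigma^2}\right] d\delta\,d\theta,$$ (344)

$$\varphi(y_1, y_2) = y_1 \mathcal{G}_\delta(\tau) + y_2 \mathcal{G}_\theta(\tau) - \frac{1}{2} y_1 \tau \frac{d}{d\tau} \mathcal{G}_\delta(\tau) - \frac{1}{2} y_2 \tau \frac{d}{d\tau} \mathcal{G}_\theta(\tau),$$

$$\tau = -y_1 \frac{d}{d\tau} \mathcal{G}_\delta(\tau) - y_2 \frac{d}{d\tau} \mathcal{G}_\theta(\tau),$$

where $\sigma^2$ is the variance of the density field.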

As a consequence of this relation one can compute constrained averages such
as the expectation value of $\theta$ under the constraint that the local density is
known, $\langle \theta \rangle_\delta$. For a vanishing variance (that is, at tree level) the result turns
out to be extremely simple and reads [42],

$$\langle \theta \rangle_\delta = \mathcal{G}_\theta(\tau), \quad \text{with} \quad \mathcal{G}_\delta(\tau) = \delta.$$ (345)

This relation can obviously be inverted to get $\langle \delta \rangle_\theta$. It is interesting to note
that this result is not quantitatively changed when top-hat smoothing effects
are taken into account (nor does it depend on the shape of the power spectrum),
which is not true anymore with Gaussian smoothing [125].
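To make the tree-level relation concrete, one can combine it with the approximate generating functions quoted earlier, $\mathcal{G}_\delta(\tau) = (1 + 2\tau/3)^{-3/2} - 1$ [Eq. (315)] and $\mathcal{G}_\theta(\tau) \approx f\,\tau\,(1 + 2\tau/3)^{-1}$. Eliminating $\tau$ then gives $\langle \theta \rangle_\delta \approx \tfrac{3}{2} f\,[1 - (1 + \delta)^{2/3}]$, which expands as $-f(\delta - \delta^2/6 + \ldots)$. The sketch below checks this elimination numerically; it is only an illustration under the approximation of Eq. (315), the function names are hypothetical, and the sign convention is that of Eq. (333), for which $\theta \approx -f\delta$ at linear order.

```python
def tau_from_delta(delta):
    """Invert G_delta(tau) = (1 + 2 tau / 3)**(-3/2) - 1 = delta for tau."""
    return 1.5 * ((1.0 + delta) ** (-2.0 / 3.0) - 1.0)

def g_theta(tau, f=1.0):
    """Approximate velocity divergence vertex generating function."""
    return f * tau / (1.0 + 2.0 * tau / 3.0)

def mean_theta_given_delta(delta, f=1.0):
    """Closed form obtained by eliminating tau between the two functions above."""
    return 1.5 * f * (1.0 - (1.0 + delta) ** (2.0 / 3.0))

f = 0.5  # illustrative value of f(Omega_m, Omega_Lambda)
for delta in (-0.8, -0.3, 0.1, 1.0, 5.0):
    via_tau = g_theta(tau_from_delta(delta), f)
    closed = mean_theta_given_delta(delta, f)
    print(f"delta = {delta:5.2f}  <theta>_delta = {closed:8.4f}  (via tau: {via_tau:8.4f})")
```

Because $\theta$ enters only through the overall factor $f$, a measurement of the curvature of this relation constrains $f$ separately from a linear bias factor, which is the point developed in the remainder of this section.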

A more pedestrian approach should be used when the variance is not negligible:

$$\langle \delta \rangle_\theta = a_0 + a_1 \theta + a_2 \theta^2 + a_3 \theta^3 + \ldots$$ (346)

$$\langle \theta \rangle_\delta = r_0 + r_1 \delta + r_2 \delta^2 + r_3 \delta^3 + \ldots$$ (347)

Computations should be made order by order and it becomes inevitable to
introduce next-to-leading order corrections, i.e. loop corrections.

The coefficients $a_0, \ldots, a_3$ and $r_0, \ldots, r_3$ have been computed explicitly up to
third order in PT [125,127,56]. It is to be noted that at leading order one has
$a_0 = -a_2 \sigma_\theta^2$ and $r_0 = -r_2 \sigma^2$ to ensure that the global ensemble average of $\theta$
Table 11

The coefficients $a_1, \ldots, a_3$ and $r_1, \ldots, r_3$ as functions of the spectral index $n$ for
scale-free power spectra and Gaussian smoothing. Results are given at leading order,
except for $a_1$ and $r_1$ for which one-loop corrections are included when available
(the correction is infinite for $n \ge -1$).

  index $n$   $a_1$                      $a_2$    $a_3$       $r_1$                  $r_2$     $r_3$
  -3.0        -                          0.190    -0.0101     $1+0.3\sigma^2$        -0.190    0.0826
  -2.5        -                          0.192    -0.00935    $1+0.202\sigma^2$      -0.192    0.0822
  -2.0        $1-0.172\sigma_\theta^2$   0.196    -0.00548    $1+0.077\sigma^2$      -0.196    0.0821
  -1.5        $1+0.187\sigma_\theta^2$   0.203    -0.000127   $1-0.296\sigma^2$      -0.203    0.0822
  -1.0        $1+[\infty]$               0.213    0.00713     $1+[\infty]$           -0.213    0.0835
  -0.5        $1+[\infty]$               0.227    0.0165      $1+[\infty]$           -0.227    0.0865
  0           $1+[\infty]$               0.246    0.0279      $1+[\infty]$           -0.246    0.0928
  0.5         $1+[\infty]$               0.270    0.0408      $1+[\infty]$           -0.270    0.1051
  1.0         $1+[\infty]$               0.301    0.0532      $1+[\infty]$           -0.301    0.1283


and $\delta$ vanish. Note also that the third-order PT results for $a_1$ and $r_1$ involve
a loop correction that diverges for $n \ge -1$. The known results are given in
Table 11 for the Einstein-de Sitter case and Gaussian smoothing.

The $\Omega_m$ dependence of these coefficients can be explicitly derived. For instance,
the coefficient $r_2$ can be expressed in terms of the skewness of the two fields (at
leading order only), which leads to $r_2 = f(\Omega_m, \Omega_\Lambda)\,[S_3 + f(\Omega_m, \Omega_\Lambda)\,T_3]/6$. For a
top-hat filter, $r_2$ is always given by $4 f(\Omega_m, \Omega_\Lambda)/21$ and, for a Gaussian window,
it varies slightly with the power spectrum index but shows a similarly strong
$f(\Omega_m, \Omega_\Lambda)$ (and therefore $\Omega_m$) dependence. Comparisons with numerical
simulations have demonstrated the accuracy and robustness of these predictions
(except for the loop terms) [56,387].

Such results are of obvious observational interest, since one can in principle
measure the value of $\Omega_m$ from velocity-density comparisons, see [179]. In
particular a detailed analysis of the curvature in the $\delta$-$\theta$ relation (through $a_2$ or
$r_2$) would provide a way to break the degeneracy between biasing parameters
(Sect. 7.1) and $\Omega_m$ [128,56] (footnote 52). Moreover, these results can be extended to take
into account redshift distortion effects (Sect. 7.4) as described in [129]. The
main practical issue is that current velocity surveys are not sufficiently large
to provide accurate density-velocity comparisons going beyond linear PT.

It is finally worth noting that these investigations are also useful for detailed
analysis of the Lyman-$\alpha$ forest [483].


52 The scatter in this relation seen in Fig. 35 can be reduced by including also 
off-diagonal components of the velocity deformation tensor [273,429,126]. 
Fig. 36. Structure of the coefficient $C_{pq}$ in the large separation limit: $C_{pq}$ is given by
the sum of all possible trees joining $p$ points in the first cell to $q$ points in the second
with only one crossing line. The sums can be done separately on each side, leading
to $C_{pq} = C_{p1}\,C_{q1}$.

5.12 The Two-Point Density PDF 


Perturbation theory can obviously be applied to any combination of the
density taken at different locations. In particular, for sound cosmic error
computations (see Chapter 6) the bivariate density distribution is an important
quantity that has been investigated in some detail.

The object of this sub-section is to present the exact results that have been
obtained at tree-level for the two-point density cumulants [51]. We consider the
joint densities at positions $\mathbf{x}_1$ and $\mathbf{x}_2$ and we are interested in computing the
cumulants $\langle \delta^p(\mathbf{x}_1)\,\delta^q(\mathbf{x}_2) \rangle_c$ where the field is supposed to be filtered at a given
scale $R$. In general such cumulants are expected to have quite complicated
expressions, depending on both the smoothing length $R$ and the distance
$|\mathbf{x}_1 - \mathbf{x}_2|$. We make here the approximation that the distance between the two points
is large compared to the smoothing scale. In other words, we neglect
short-distance effects.

Let us define the parameters $C_{pq}$ with,

$$C_{pq} \equiv \frac{\langle \delta^p(\mathbf{x}_1)\,\delta^q(\mathbf{x}_2) \rangle_c}{\langle \delta^2 \rangle^{p+q-2}\,\langle \delta(\mathbf{x}_1)\,\delta(\mathbf{x}_2) \rangle}.$$ (348)

Because of the tree structure of the correlation hierarchy, we expect the
coefficients $C_{pq}$ to be finite in both the large distance limit and at leading order in
the variance. This expresses the fact that among all the diagrams that connect
the two cells, the ones that involve only one line between the cells are expected
to be dominant in cases when $\langle \delta(\mathbf{x}_1)\,\delta(\mathbf{x}_2) \rangle \ll \langle \delta^2 \rangle$.

The next remarkable property is directly due to the tree structure of the
high-order correlation functions. The coefficients $C_{pq}$ are dimensionless quantities,
that correspond to some geometrical averages of trees. It is quite easy to realize
(see Fig. 36) that such averages can be factorized into two parts, corresponding
to the end points of the line joining the two cells. In other words one should
have,

$$C_{pq} = C_{p1}\,C_{q1}.$$ (349)

This factorization property is specific to tree structures. It was encountered
originally in previous work in the fully non-linear regime [40]. It has specific
consequences on the behavior of the two-point density PDF, namely we expect
that,

$$p[\rho(\mathbf{x}_1), \rho(\mathbf{x}_2)] = p[\rho(\mathbf{x}_1)]\,p[\rho(\mathbf{x}_2)] \left(1 + b[\rho(\mathbf{x}_1)]\,\langle \delta(\mathbf{x}_1)\,\delta(\mathbf{x}_2) \rangle\,b[\rho(\mathbf{x}_2)]\right).$$ (350)

The joint density PDF is thus entirely determined by the shape of the “bias”
function, $b(\rho)$ (footnote 53).

The general computation of the $C_{p1}$ series is not straightforward, although
the tree structure of the cumulants is indicative of a solution. Indeed the
generating function $\psi(y)$ of $C_{p1}$,

$$\psi(y) = \sum_{p=1}^{\infty} C_{p1}\,\frac{(-1)^{p-1}\,y^p}{p!},$$ (351)

corresponds to the generating function of the diagrams with one external line.
For exact trees this would be $\tau(y)$. However, the Lagrangian to Eulerian
mapping affects the relation between $\psi(y)$ and $\tau(y)$ and this should be taken into
account. We give here the final expression of $\psi(y)$, derived in detail in [51],


$$\psi(y) = \tau(y)\,\frac{\sigma(R)}{\sigma\left(R\,[1 + \mathcal{G}_\delta^E(\tau)]^{1/3}\right)},$$ (352)

where $\tau(y)$ is the solution of the implicit Eq. (263). A formal expansion of $\psi(y)$
with respect to $y$ gives the explicit form of the first few coefficients $C_{p1}$. They
can be expressed in terms of the successive logarithmic derivatives of the
variance, $\gamma_p$ [Eq. (280)],


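$$C_{21} = \frac{68}{21} + \frac{\gamma_1}{3},$$ (353)

$$C_{31} = \frac{11710}{441} + \frac{61}{7}\gamma_1 + \frac{2}{3}\gamma_1^2 + \frac{\gamma_2}{3},$$ (354)

$$C_{41} = \frac{107906224}{305613} + \frac{90452}{441}\gamma_1 + \frac{116}{3}\gamma_1^2 + \frac{7}{3}\gamma_1^3 + \frac{758}{63}\gamma_2 + \frac{20}{9}\gamma_1\gamma_2 + \frac{2}{9}\gamma_3.$$ (355)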


These numbers provide a set of correlators that describe the joint density 
distribution in the weakly nonlinear regime. They generalize the result found 
initially in [231] for C 21 . Numerical investigations (e.g. [51]) have shown that 
the large separation approximation is very accurate even when the cells are 
quite close to each other. 

For a comparison of the above results with $N$-body simulations and the
spherical collapse model see [263].
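The constant terms of the $C_{p1}$ coefficients (the values for $\gamma_p = 0$, where the mapping factor is unity and $\psi(y) = \tau(y)$) can be recovered by a power-series solution of the implicit equation for $\tau(y)$. The sketch below assumes that Eq. (263) takes the form $\tau = -y\,d\mathcal{G}_\delta/d\tau$ with $\mathcal{G}_\delta(\tau) = \sum_p \nu_p (-\tau)^p/p!$, and reads the coefficients off with the convention $\psi(y) = \sum_p C_{p1} (-1)^{p-1} y^p/p!$. The unsmoothed 3D vertices $\nu_2 = 34/21$, $\nu_3 = 682/189$ and $\nu_4 = 446440/43659$ are entered by hand as known inputs; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b, order):
    """Product of two power series truncated at y**order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out

def tau_series(nu, order):
    """Coefficients of tau(y) solving tau = -y dG/dtau, by fixed-point iteration.
    Each pass fixes one more order in y, so `order` passes are enough."""
    tau = [Fraction(0)] * (order + 1)
    for _ in range(order):
        power = [Fraction(1)] + [Fraction(0)] * order   # tau**0
        bracket = [Fraction(0)] * (order + 1)           # -dG/dtau as a series in y
        for p in range(1, order + 1):
            weight = nu[p] * (-1) ** (p - 1) / factorial(p - 1)
            bracket = [b + weight * q for b, q in zip(bracket, power)]
            power = multiply(power, tau, order)         # now tau**p
        tau = [Fraction(0)] + bracket[:order]           # multiply by y
    return tau

# Unsmoothed 3D vertices (assumed known inputs), with nu_1 = 1.
nu = {1: Fraction(1), 2: Fraction(34, 21), 3: Fraction(682, 189),
      4: Fraction(446440, 43659)}
tau = tau_series(nu, 4)
c_p1 = [(-1) ** (p - 1) * factorial(p) * tau[p] for p in range(1, 5)]
print("C_11, C_21, C_31, C_41 =", [str(c) for c in c_p1])
```

Including the $\gamma_p$ dependence would require multiplying $\tau(y)$ by the series expansion of the mapping factor in Eq. (352) before reading off the coefficients, which is most conveniently done with a computer algebra system.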

53 The interpretation of this function as a bias function is discussed in Sect. 7.1.2. 
Fig. 37. The cumulants $S_p$ in the $\tau$CDM model as functions of $\bar\xi = \sigma^2$, for $p = 3$, 4
and 5 (with respectively triangles, squares and pentagons) compared to tree order
PT predictions assuming a local power spectrum (dots), taking into account spectral
index variation, i.e. corrections $\gamma_p$, $p \ge 2$ in Eqs. (281-284) (long dashes on right
panel), EPT where $n_{\rm eff}$ is inferred from the measured $S_3$ (short dashes) and one loop
perturbation theory predictions based on the spherical model (dots-long dashes on
left panel). From [153].

5.13 Extended Perturbation Theories 

The range of validity of perturbation theory results suggests that they provide,
on a purely phenomenological basis, a robust model for describing the correlation
hierarchy in all regimes. In the Extended Perturbation Theory (EPT) ansatz,
the $S_p$'s are assumed to be given by Eqs. (281-284) with $\gamma_1 = -(n+3)$ and
$\gamma_i = 0$, $i \ge 2$, where $n = n_p(\sigma)$ is an adjustable parameter inferred from the
measured value of $S_p$ as a function of the measured variance $\sigma^2$:

$$S_p[n = n_p(\sigma)] = S_p^{\rm measured}(\sigma). \tag{356}$$

As observed in [151], for scale-free initial conditions, the function $n_p(\sigma)$ does
not depend on cumulant order $p$ to a very good approximation:

$$n_p(\sigma) \simeq n_{\rm eff}(\sigma) \tag{357}$$

in any regime, from very small$^{54}$ values of $\sigma$ to very large values of $\sigma$. A simple
form has been proposed to account for these results [151],

$$n_{\rm eff} = n + (n_{\rm nonlinear} - n)\,\frac{x^{\tau}}{x^{\tau} + x^{-\tau}}, \qquad x = \exp[\log_{10}(\sigma^2/\sigma_0^2)], \tag{358}$$

54 Of course, in this regime $n_{\rm eff} = n$, where $n$ is the linear spectral index.

Table 12

Parameters used in fit (358).

  $n$    $n_{\rm nonlinear}$    $n^{+}_{\rm nonlinear}$    $n^{-}_{\rm nonlinear}$    $\sigma_0$    $\tau$
  -2     -9.5                   -12.4                      -7.22                      1.6           1.4
  -1     -3                     -3.8                       -2.24                      1.4           1.2
   0     -1.2                   -1.6                       -0.86                      1.25          0.6
  +1     -0.85                  -1.17                      -0.57                      0.7           0.3


where $n_{\rm eff}$ varies from the value of the initial power spectrum index, $n$, to a
value corresponding to the stable clustering regime, $n_{\rm nonlinear}$. The location and
the width of the transition between these two regimes depend on the initial
power spectrum index and are described respectively by $\sigma_0$ and $\tau$. Values of
the parameters involved in Eq. (358) are listed in Table 12 for $n$ ranging from
$-2$ to $1$. These values can be approximately obtained by the following fitting
formulae valid for $n \le -1$

$$n_{\rm nonlinear}(n) \simeq \frac{3(n-1)}{(3+n)}, \tag{359}$$

$$\tau(n) \simeq 0.8 - 0.3\,n, \tag{360}$$

$$\log_{10}\sigma_0^2(n) \simeq 0.2 - 0.1\,n. \tag{361}$$


Equation (359) is in good agreement with measurements of the bispectrum [234]
in N-body simulations as well as predictions from HEPT (Sect. 4.5.6). For a
realistic, scale dependent spectral index (such as in CDM models), the situation
becomes slightly more complicated since Eq. (357) is in principle not valid
anymore, at least in the weakly nonlinear regime, due to the corrections $\gamma_p$, $p \ge 2$,
in Eqs. (281-284), which should be taken into account. However, these
corrections are in practice quite small [44,28,153] and can be neglected in a first
approximation, as illustrated by the right panel of Fig. 37. Then, Eq. (357)
extends as well to non-scale-free spectra such as CDM models [151,153,629]
(see Fig. 37).
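The fit of Eq. (358) together with Eqs. (359)-(361) is simple enough to be evaluated directly. The sketch below is only illustrative: the function names are ours, Eq. (361) is read as a fit to $\log_{10}\sigma_0^2$ (a reading which reproduces the $\sigma_0$ column of Table 12 for $n = -2$, $-1$ and $0$), and the skewness is obtained from the standard tree-level result $S_3 = 34/7 + \gamma_1$ of Eq. (281) with $\gamma_1 = -(n_{\rm eff} + 3)$.

```python
import math

def ept_fit_parameters(n):
    """Fitting formulae for the EPT parameters as functions of the initial index n."""
    n_nonlinear = 3.0 * (n - 1.0) / (3.0 + n)   # Eq. (359)
    tau = 0.8 - 0.3 * n                          # Eq. (360)
    sigma0_sq = 10.0 ** (0.2 - 0.1 * n)          # Eq. (361), read as log10(sigma_0^2)
    return n_nonlinear, tau, sigma0_sq

def n_eff(sigma_sq, n):
    """Effective index of Eq. (358) at variance sigma_sq for initial index n."""
    n_nonlinear, tau, sigma0_sq = ept_fit_parameters(n)
    x = math.exp(math.log10(sigma_sq / sigma0_sq))
    weight = x**tau / (x**tau + x**(-tau))       # goes from 0 (linear) to 1 (nonlinear)
    return n + (n_nonlinear - n) * weight

def s3_ept(sigma_sq, n):
    """EPT skewness: tree-level S_3 = 34/7 + gamma_1 with gamma_1 = -(n_eff + 3)."""
    return 34.0 / 7.0 - (n_eff(sigma_sq, n) + 3.0)

for sigma_sq in (1e-2, 1.0, 1e2):
    print(sigma_sq, n_eff(sigma_sq, -2.0), s3_ept(sigma_sq, -2.0))
```

By construction $n_{\rm eff}$ equals the midpoint of $n$ and $n_{\rm nonlinear}$ at $\sigma^2 = \sigma_0^2$, and tends to $n$ and $n_{\rm nonlinear}$ in the two asymptotic regimes.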

It is even possible to use scale-free power spectra results, Eq. (358), with an
appropriate choice of $n$ in Eqs. (359-361), $n = -\gamma_1(R) - 3$ obtained from
the linear variance computed at smoothing scale $R$, to obtain an approximate
fit of the function $n_{\rm eff}(\sigma)$ [151]. It is worth noting as well that EPT is a good
approximation for the $S_p$'s measured in 2D galaxy catalogs, with $n_{\rm eff}$ varying
from approximately $-2$ to $-5$ depending on the angular scale considered [622].
This description can be extended to the joint moments [623], giving the
so-called E$^2$PT framework [630,153]. This provides a reasonable description of
the joint cumulants in the nonlinear regime, but not as accurate as EPT for
one-point cumulants [153]. However, a first application suggests that this is in
disagreement with observations [623].

Both EPT and E$^2$PT provide useful ways of describing higher-order statistics
as functions of a single parameter $n_{\rm eff}$ and can be used for estimating cosmic
errors on statistics measured in galaxy catalogs, as discussed in the next
chapter. However, except in the weakly nonlinear regime, these prescriptions lack
any rigorous theoretical background, although some elements towards their
justification can be found in HEPT (see Sect. 4.5.6).

6 From Theory to Observations: Estimators and Errors

6.1 Introduction 

This chapter focuses on issues regarding accurate estimation of clustering
statistics in large-scale galaxy surveys and their uncertainties, in order to
properly constrain theories against observations. We also consider applications to
measurements in N-body simulations, as briefly described in Sect. 6.12.

In many respects, the theory of estimators of large scale structure statistics
was initiated in the seventies and the early eighties by Peebles and his
collaborators. In a series of seminal works, starting with a fundamental paper [500],
these authors developed the statistical theory of the two-point correlation
function in real and Fourier space, in two- and three-dimensional catalogs,
including estimates of the cosmic errors and the cosmic bias (formulated as
an integral constraint problem), followed soon by investigations on
higher-order statistics. They used several estimators, including count-in-cell statistics.
These results are summarized in [508].

Since then, and particularly in the nineties, a number of techniques were put
forward to allow a more precise testing of cosmological theories against
observations. These include:

- Detailed studies of estimators of two-point and higher-order correlation
functions.

- Accurate estimation of errors going beyond the simple Poisson error bars
(often a severe underestimate), to include finite-volume effects, survey
geometry and non-Gaussian contributions due to non-linear evolution.

- The treatment of covariance between measurements at different scales. In
order to properly test theoretical predictions, this is as important
as an accurate treatment of errors, which are just the diagonal elements
of the covariance matrix. Neglecting off-diagonal elements can lead to a
substantial overestimate of the constraining power of observations (see
e.g. Chapter 8).

- Implementation of techniques for data compression, error decorrelation,
and likelihood analysis for cosmological parameter estimation.

It is clear that the upcoming large-scale galaxy surveys such as 2dFGRS and
SDSS will have to rely heavily on these new developments to extract
all the information encoded by galaxy clustering to constrain cosmological
parameters, primordial non-Gaussianity and galaxy formation models. In
addition to standard second-order statistics such as the power spectrum or the
two-point correlation function, our review focuses on higher-order statistics
for several reasons:

- As detailed in previous chapters, non-linear evolution leads to deviations
from Gaussianity, so two-point statistics are not enough to characterize
large-scale structure. They do not contain all the information available
to constrain cosmological theories$^{55}$.

- The additional information encoded by higher-order statistics can be
used, for example, to constrain galaxy biasing (Sect. 7.1), primordial
non-Gaussianity (Sects. 4.4 and 5.6) and break degeneracies present in
measurements of two-point statistics, e.g. those obtained from
measurements of the redshift-space power spectrum (Sect. 7.4). PT provides a
framework for accomplishing this$^{56}$.

- The significant improvement in accuracy for higher-order statistics
measurements expected in upcoming large scale surveys, see e.g. Fig. 40 below.

Needless to say, measurements in galaxy catalogs are subject to a number
of statistical and systematic uncertainties that must be properly addressed
before comparing to theoretical predictions. Succinctly:

(i) Instrumental biases and obscuration: there are technical limitations due
to the telescopes and the instruments attached to them. For example, in
spectroscopic surveys using multifiber devices such as the SDSS, close pairs
of galaxies are not perfectly sampled unless several passes of the same
part of the sky are done (e.g. see [74]). This can affect the measurement
of clustering statistics, in particular higher-order correlations. Also, the
sky is contaminated by sources (such as stars), dust extinction from our
galaxy, etc.

(ii) Dynamical biases and segregation: unfortunately it is not always possible
to measure directly quantities of dynamical interest: in three-dimensional
catalogs, the estimated object positions are contaminated by peculiar
velocities of galaxies. In 2-D catalogs, the effects of projection of the galaxy
distribution along the line of sight must be taken into account.
Furthermore, galaxy catalogs sample the visible matter, whose distribution is
in principle different from that of the matter. The resulting galaxy bias
might depend on environment, galaxy type and brightness. Objects
selected at different distances from the observer do not necessarily have the
same properties: e.g. in magnitude-limited catalogs, the deeper objects
are intrinsically brighter. One consequence in that case is that the
number density of galaxies decreases with distance and thus corrections for
this are required unless using volume-limited catalogs.

(iii) Statistical biases and errors: the finite nature of the sample induces
uncertainties and systematic effects on the measurements, denoted below
as cosmic bias and cosmic error. These cannot be avoided (although it is
possible to estimate corrections in some cases), only reduced by increasing
the size of the catalog and optimizing its geometry.

55 For example, although one could construct a matter linear power spectrum that
evolves non-linearly into the observed galaxy power spectrum (see Fig. 51), it is
not possible to match at the same time the higher-order correlations at small scales
(see Fig. 54). This implies non-trivial galaxy biasing in the non-linear regime, as we
discuss in detail in Sects. 8.2.4-8.2.5.

56 A quantitative estimate of how much information is added by considering
higher-order statistics is presented in [645].

In this chapter, we concentrate mainly on point (iii). Dynamical biases
mentioned in point (ii) will be addressed in the next chapter. These effects
can also be taken into account in the formalism, by simply replacing the values
of the statistics entering the equations giving cosmic errors and
cross-correlations with the “distorted” ones, as we shall implicitly assume in the rest
of this chapter$^{57}$. Segregation effects and incompleteness due to instrument
biases, obscuration or to selection in magnitude will be partly discussed here
through weighted estimators, and in Chapter 8 when relevant.

This chapter is organized as follows. In Sect. 6.2, we discuss the basic
concepts of cosmic bias, cosmic error and the covariance matrix. Before entering
into technical details, it is important to discuss the fundamental assumptions
implicit in any measurement in a galaxy catalog, namely the fair sample
hypothesis [500] and the local Poisson approximation. This is done in Sect. 6.3,
where basic concepts on count-in-cell statistics and discreteness effects
corrections are introduced to illustrate the ideas. In Sect. 6.4, we study the most
widely used statistic, the two-point correlation function, with particular
attention to the Landy and Szalay estimator [393] introduced in Sect. 6.4.1.
The corresponding cosmic errors and biases are given and discussed in several
regimes. Section 6.5 is similar to Sect. 6.4, but treats the Fourier counterpart
of $\xi$, the power spectrum. Generalization to higher-order statistics is discussed
in Sect. 6.6.

Section 6.7 focuses on the count-in-cell distribution function, which probes
the density field smoothed with a top-hat window. In that case a full analytic
theory for estimators and corresponding cosmic errors and biases is available.
Section 6.8 discusses multivariate counts-in-cells statistics. In Sect. 6.9 we
introduce the notion of optimal weighting: each galaxy or fraction of space can
be given a specific statistical weight chosen to minimize the cosmic error.
Section 6.10 deals with cross-correlations and the shape of the cosmic distribution
function and discusses the validity of the Gaussian approximation, useful for
maximum likelihood analysis. Section 6.11 reinvestigates the search for
optimal estimators in a general framework in order to give account of recent
developments. In particular, error decorrelation and the discrete
Karhunen-Loève transforms are discussed. Finally, Sect. 6.12 discusses the particular case
of measurements in N-body simulations.

57 Of course, this step can be non trivial. Measurements in galaxy catalogs (Sect. 8)
and in N-body simulations suggest that in the nonlinear regime the hierarchical
model is generally a good approximation (e.g. [87,234,147,150,472]), but it can
fail to describe fine statistical properties (e.g. for the power spectrum covariance
matrix [564,296]). In the weakly nonlinear regime, PT results including redshift
distortions (Sect. 7.4), projection along the line of sight (Sect. 7.2) and biasing
(Sect. 7.1) can help to compute the quantities determining cosmic errors, biases
and cross-correlations. In addition to the hierarchical model, extensions of PT to
the nonlinear regime, such as EPT, E$^2$PT (Sect. 5.13) and HEPT (Sect. 4.5.6),
coupled with a realistic description of galaxy biasing can be used to estimate the
errors.

In what follows, we assume we have a $\mathcal{D}$-dimensional galaxy catalog $D$ of
volume $V$ and containing $N_g$ objects, with $N_g \gg 1$, corresponding to an
average number density $\bar n_g = N_g/V$. Similarly we define a pure random catalog
$R$ of same geometry and same number of objects$^{58}$. Despite the fact that we
use three-dimensional notations ($\mathcal{D} = 3$), most of the results below are valid as
well for angular surveys except when specified otherwise. Simply, $\xi(r)$ has to
be replaced with $w(\theta)$, $Q_N$ with $q_N$, etc.

6.2 Basic Concepts 

6.2.1 Cosmic Bias and Cosmic Error 

In order to proceed we need to introduce some new notation. If $A$ is a statistic,
its estimator will be designated by $\hat A$. The probability $\Upsilon(\hat A)$ of measuring
the value $\hat A$ in a galaxy catalog (given a theory) will be called the cosmic
distribution function. The ensemble average of $\hat A$ (the average over a large
number of virtual realizations of the galaxy catalog) is

$$\langle \hat A \rangle = \int d\hat A\;\Upsilon(\hat A)\,\hat A. \tag{362}$$

Due to their nonlinear nature many estimators (such as ratios) are biased, i.e.
their ensemble average is not equal to the real value $A$: the cosmic bias (so
called to distinguish it from the bias between the galaxy distribution and the
matter distribution),

$$b_{\hat A} = \frac{\langle \hat A \rangle - A}{A}, \tag{363}$$

does not vanish, except when the size of the catalog becomes infinite (if the
estimator is properly normalized).

A good estimator should have minimum cosmic bias. It should as well minimize
the cosmic error, which is usually obtained by calculating the variance of the
function $\Upsilon$:

$$(\Delta\hat A)^2 = \langle(\delta\hat A)^2\rangle = \int (\delta\hat A)^2\,\Upsilon(\hat A)\,d\hat A, \tag{364}$$

with

$$\delta\hat A = \hat A - \langle\hat A\rangle. \tag{365}$$

The cosmic error is most useful when the function $\Upsilon(\hat A)$ is Gaussian. If this is
not the case, full knowledge of the shape of the cosmic distribution function,
including its skewness, is necessary to interpret correctly the measurements$^{59}$.

58 Note that $R$ stands as well for a smoothing scale, but the meaning of $R$ will be
easily determined by the context.
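The definitions of Eqs. (363) and (364) can be illustrated with a toy Monte-Carlo experiment. The sketch below is not taken from any of the works cited here: it assumes a "catalog" made of `n_cells` independent Poisson counts of mean `nbar`, and uses the ratio estimator $\hat A = \overline{N^2}/\bar N^2$, whose true value for a Poisson process is $A = 1 + 1/\bar N$. All names and parameter values are illustrative.

```python
import numpy as np

def cosmic_bias_and_error(n_cells, nbar=5.0, n_real=20000, seed=0):
    """Monte-Carlo estimate of the cosmic bias, Eq. (363), and cosmic error,
    square root of Eq. (364), of the ratio estimator <N^2>/<N>^2."""
    rng = np.random.default_rng(seed)
    # Each row is one virtual realization of the catalog.
    counts = rng.poisson(nbar, size=(n_real, n_cells)).astype(float)
    m1 = counts.mean(axis=1)
    m2 = (counts**2).mean(axis=1)
    a_hat = m2 / m1**2                     # nonlinear (ratio) estimator
    a_true = 1.0 + 1.0 / nbar              # exact value for a Poisson process
    bias = (a_hat.mean() - a_true) / a_true
    error = a_hat.std()
    return bias, error

for n_cells in (20, 200):
    print(n_cells, cosmic_bias_and_error(n_cells))
```

Both quantities shrink as the toy catalog grows, in line with the statement that the cosmic bias of a properly normalized estimator vanishes only in the limit of an infinite catalog.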

6.2.2 The Covariance Matrix 

As for correlation functions, a simple generalization of the concept of variance
is that of covariance between two different quantities; this can be for example
between two estimators $\hat A$ and $\hat B$,

$$\mathrm{Cov}(\hat A, \hat B) = \langle\delta\hat A\,\delta\hat B\rangle = \int \delta\hat A\,\delta\hat B\;\Upsilon(\hat A, \hat B)\,d\hat A\,d\hat B, \tag{366}$$

or simply between estimates of the same quantity at different scales; say, for
the power spectrum, the covariance matrix between estimates of the power at
$k_i$ and $k_j$ reads

$$C_{ij} = \langle\hat P(k_i)\hat P(k_j)\rangle - \langle\hat P(k_i)\rangle\langle\hat P(k_j)\rangle, \tag{367}$$

where $\hat P(k_i)$ is the estimator of the power spectrum in a band power centered
about $k_i$.

In general, testing theoretical predictions against observations requires
knowledge of the joint covariance matrix for all the estimators (e.g. power spectrum,
bispectrum) at all scales considered. We will consider some examples below in
Sects. 6.4.4, 6.5.4 and 6.10.2.
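When a set of realizations is available (for instance from simulations), Eq. (367) can be estimated directly by replacing ensemble averages with averages over realizations. The following is a minimal sketch under that assumption; `p_hat` is a hypothetical array of band-power estimates with one row per realization and one column per band.

```python
import numpy as np

def covariance_matrix(p_hat):
    """Sample version of Eq. (367): C_ij = <P_i P_j> - <P_i><P_j>."""
    p_hat = np.asarray(p_hat, dtype=float)
    mean = p_hat.mean(axis=0)
    second = (p_hat[:, :, None] * p_hat[:, None, :]).mean(axis=0)
    return second - np.outer(mean, mean)

def correlation_matrix(cov):
    """Normalized covariance, convenient for judging off-diagonal elements."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)
```

The diagonal of the returned matrix contains the squared cosmic errors of Eq. (364); this simple average divides by the number of realizations, so it is the biased sample covariance and agrees with `numpy.cov(p_hat, rowvar=False, bias=True)`.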

The cosmic error and the cosmic bias can be roughly separated into three
contributions [621] if the scale $R$ (or separation) considered is small enough
compared to the typical survey size $L$, or equivalently, if the volume $v \equiv v_R =
(4/3)\pi R^3$ is small compared to the survey volume, $V$:

(i) Finite volume effects: they are due to the fact that we can have access to
only a finite number of structures of a given size in surveys (whether they
are 2-D or 3-D surveys), in particular the mean density itself is not always
well determined. These effects are roughly proportional to the average of
the two-point correlation function over the survey, $\bar\xi(L)$. They are usually
designated by “cosmic variance”.

(ii) Edge effects: they are related to the geometry of the catalog. In general,
estimators give less weight to galaxies near the edge than those far away
from the boundaries. As we shall see later, edge effects can be partly
corrected for, at least for $N$-point correlation functions. At leading order in
$v/V$, they are roughly proportional to $\bar\xi\,v/V$. Note that even 2-D surveys
cannot avoid edge effects because of the need to mask out portions of
the sky due to galaxy obscuration, bright stars, etc. Edge effects vanish
only for N-body simulations with periodic boundary conditions.

(iii) Discreteness effects: one usually assumes that the observed galaxy
distribution is a discrete, local Poisson representation of an underlying smooth
field whose statistical properties one wants to extract. This discrete
nature has to be taken into account with appropriate corrections, not only
to the mean of a given statistic but also to the error. Discreteness errors,
which are proportional to $1/N_g$ at some power where $N_g$ is the number
of objects in the catalog, become negligible for large enough $N_g$.

59 For example, it could be very desirable to impose in this case that a good estimator
should have minimum skewness [610].

The above separation into three contributions is convenient but somewhat
artificial, since all the effects are correlated with each other. For example,
there are edge-discreteness effects and edge-finite-volume effects [624]. At next
to leading order in $R/L$, there is a supplementary edge effect contribution
proportional to the perimeter of the survey, which is most important when the
geometry of the survey is complex, and dominant when $R/L \approx 1$ [537,154].


6.3 Fair Sample Hypothesis and Local Poisson Approximation 
6.3.1 The Fair Sample Hypothesis 

A stochastic field is called ergodic if all information about its multi-point
probability distributions (or its moments) can be obtained from a single realization
of the field. For example, Gaussian fields with continuous power spectrum are
ergodic [3].

The Fair Sample Hypothesis [500] states that the finite part of the universe
accessible to observations is a fair sample of the whole, which is represented by a
statistically homogeneous and isotropic (as defined in Sect. 3.2.1) ergodic field.
Together with the ergodic assumption, the fair sample hypothesis states that
well separated parts of the (observable) Universe are independent realizations
of the same physical process and that there are enough of such independent
samples to obtain all the information about its probability distributions (e.g.
[508,61]). Under the fair sample hypothesis, ensemble averages can be replaced
with spatial averages. In the simplest inflationary models leading to Gaussian
primordial fluctuations, the fair sample hypothesis holds, but special cases can
be encountered in models of the Universe with non-trivial global topological
properties (see e.g. [389]) where apparently well separated parts of the Universe
may be identical.


6.3.2 Poisson Realization of a Continuous Field 

In general, statistical properties of the density field are measured in a discrete
set of points, composed e.g. of galaxies or N-body particles. It is natural to
assume that such point distributions result from a Poisson realization of an
underlying continuous field. This means that the probability of finding $N$
points in a volume $v$ at location $\mathbf{r}$ is given by $P_N^{\rm Poisson}[\bar n_g v(1 + \delta(\mathbf{r}))]$, where
$P_N^{\rm Poisson}(\bar N)$ is the probability of finding $N$ objects in a Poisson process with
expectation number $\bar N = \bar n_g v$,

$$P_N^{\rm Poisson}(\bar N) \equiv \frac{\bar N^N}{N!}\,e^{-\bar N}, \tag{368}$$

$\delta(\mathbf{r})$ is the overall density contrast within the volume and $\bar n_g$ is the average
number density of the random process. It implies that the count probability
distribution function, hereafter CPDF, defined as the probability $P_N$ of finding
$N$ galaxies in a cell of size $R$ and volume $v$ thrown at random in the catalog,
can be expressed through the convolution,


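$$P_N = \int_{-1}^{+\infty} d\delta\;p(\delta)\,P_N^{\rm Poisson}[\bar N(1 + \delta)], \tag{369}$$

where the average number of objects per cell, $\bar N$, reads

$$\bar N = \sum_N N P_N. \tag{370}$$

In the continuous limit, $\bar N \to \infty$, the CPDF of course tends to the PDF of
the underlying density field

$$P_N \to \frac{1}{\bar N}\,p(\delta), \qquad N = \bar N(1 + \delta). \tag{371}$$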


It is worth mentioning at this point the void probability function, $P_0$, which
can be defined in discrete samples only. From Eqs. (369) and (368), it reads

$$P_0 = \int_{-1}^{+\infty} d\delta\;p(\delta)\exp[-\bar N(1 + \delta)], \tag{372}$$

which can be expressed in terms of the cumulant generating function [687,16,619]
(see Sect. 3.3),

$$P_0 = \exp\left[-\bar N + \mathcal{C}(-\bar N)\right] = \exp\left[-\bar N + \sum_{n=2}^{\infty}\frac{(-\bar N)^n}{n!}\,\langle\delta^n\rangle_c\right]. \tag{373}$$


This property was used in practice to obtain directly the cumulant generating 
function from the void probability function (e.g., [445,205,92]), relying on the 
local Poisson approximation. 
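The convolution of Eq. (369) and the void probability of Eq. (372) are straightforward to evaluate numerically once a model for $p(\delta)$ is chosen. In the sketch below the integral over $p(\delta)$ is replaced by an average over samples of $\delta$; the lognormal model used in the usage lines is only an assumption made for the sake of the example, and the function name is ours.

```python
import numpy as np

def cpdf_from_delta_samples(delta, nbar, n_max):
    """Monte-Carlo version of Eq. (369): P_N for N = 0..n_max, averaging the
    Poisson law of mean nbar * (1 + delta) over samples of delta > -1."""
    lam = nbar * (1.0 + np.asarray(delta, dtype=float))
    n = np.arange(n_max + 1)
    # log(N!) accumulated as a running sum of log(1), log(2), ...
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))))
    log_pmf = n[None, :] * np.log(lam[:, None]) - lam[:, None] - log_fact[None, :]
    return np.exp(log_pmf).mean(axis=0)

# Usage with an assumed lognormal density field of zero mean contrast.
rng = np.random.default_rng(1)
s = 0.5
delta = np.exp(rng.normal(0.0, s, 5000) - 0.5 * s**2) - 1.0
p_n = cpdf_from_delta_samples(delta, nbar=3.0, n_max=120)
p_void = p_n[0]   # equals the sample average of exp[-nbar (1 + delta)], Eq. (372)
```

The first element of the returned array is the void probability; comparing it with $\exp(-\bar N)$ gives a direct measure of how far the sampled field is from an unclustered Poisson process.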

Obviously, the validity of the local Poisson approximation is questionable. A
simple argument against it is that galaxies have an extended size which defines
zones of mutual exclusion and suggests that at very small scales, galaxies do
not follow a local Poisson process because they must be anti-correlated. One
way to bypass this problem is of course to choose the elementary volume such
that it has a sufficiently large size, say $\ell >$ a few tens of kpc. One might still
argue that short-range physical processes depending on environment might
influence small-scale statistics in such a way that it might be impossible to
find a reasonably small scale $\ell$ for which the Poisson process is valid. Also, the
galaxy distribution might keep memory of initial fluctuations of the density
field, even at small, nonlinear scales, particularly in underdense regions which
do not experience shell-crossing and violent relaxation. If for example these
initial conditions were locally fractal up to some very small scale, obviously
the local Poisson approximation would break down. Note on the other hand
that sparse sampling strategies [361], which were used to build a number of
galaxy catalogs, make the samples “closer” to Poisson.

It is generally assumed that the observed galaxy distribution follows the local
Poisson approximation. To our knowledge there exists no direct rigorous check
of the validity of this statement, but it is supported indirectly, for example
by the fact that the measured count probability distribution function (CPDF,
see Sect. 6.7) in galaxy catalogs compares well with models relying on the local
Poisson approximation (see, e.g. [92]).

In N-body simulations, the local Poisson assumption is in general very good$^{60}$.
However this depends on the statistic considered and there are some
requirements on the degree of evolution of the system into the nonlinear regime, as
discussed below in Sect. 6.12.2.

Under the local Poisson approximation, it is possible to derive
the correlation functions of the discrete realization in terms of the underlying
continuous one. In particular, from Eq. (369) the moment generating function
of the discrete realization, $\mathcal{M}_{\rm disc}$, is related to that of the continuous field,
$\mathcal{M}$ (Sect. 3.3.3), by $\mathcal{M}_{\rm disc}(t) = \mathcal{M}[\exp(t) - 1]$. This leads to the
standard expressions for moments and spectra of discrete realizations in terms of
continuous ones, e.g. see [396,508,233,619,247,434]. Here we give the first few
low-order moments

$$\langle\delta_n^2\rangle = \frac{1}{\bar N} + \bar\xi_2, \tag{374}$$

$$\langle\delta_n^3\rangle = \frac{1}{\bar N^2} + 3\,\frac{\bar\xi_2}{\bar N} + \bar\xi_3, \tag{375}$$

where $\delta_n \equiv (N - \bar N)/\bar N$ denotes the discrete number density contrast. In
Sect. 6.7, which discusses in more detail count-in-cells statistics, we shall see
that there exists an elegant way of correcting for discreteness effects using
factorial moments.
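Equations (374) and (375) can be inverted to recover the moments of the underlying continuous field from measured counts. The sketch below does this for an array of counts in cells; it is a direct transcription of the two equations, not the factorial-moment method of Sect. 6.7, and the function name is illustrative.

```python
import numpy as np

def corrected_moments(counts):
    """Shot-noise corrected second and third moments of the density contrast,
    obtained by inverting Eqs. (374) and (375)."""
    counts = np.asarray(counts, dtype=float)
    nbar = counts.mean()                       # average count per cell
    dn = (counts - nbar) / nbar                # discrete density contrast
    xi2 = np.mean(dn**2) - 1.0 / nbar
    xi3 = np.mean(dn**3) - 3.0 * xi2 / nbar - 1.0 / nbar**2
    return xi2, xi3
```

For a pure Poisson sample with no underlying clustering both returned values scatter around zero, which provides a simple sanity check of a counts-in-cells pipeline.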

Similarly, for the power spectrum and bispectrum,

$$\langle\delta_n(\mathbf{k}_1)\delta_n(\mathbf{k}_2)\rangle = \left[\frac{1}{N_g} + P(k_1)\right]\delta_n(\mathbf{k}_{12}), \tag{376}$$

$$\langle\delta_n(\mathbf{k}_1)\delta_n(\mathbf{k}_2)\delta_n(\mathbf{k}_3)\rangle = \left[\frac{1}{N_g^2} + \frac{1}{N_g}(P_1 + P_2 + P_3) + B_{123}\right]\delta_n(\mathbf{k}_{123}), \tag{377}$$

where $P_i \equiv P(k_i)$, $B_{123} \equiv B(k_1, k_2, k_3)$, $\mathbf{k}_{12} = \mathbf{k}_1 + \mathbf{k}_2$ and $\mathbf{k}_{123} = \mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3$.

60 Except when dealing with the clustering of dark matter halos; in this case
exclusion effects can lead to sub-Poisson sampling, see e.g. [599].


6.4 The Two-Point Correlation Function 

In this section, we present the traditional estimators of the two-point
correlation function based on pair counting$^{61}$. We assume that the catalog under
consideration is statistically homogeneous. Optimal weighting and correction
for selection effects will be treated in Sect. 6.9. More elaborate estimates taking
into account cross-correlations between bins will be discussed in Sect. 6.10.

6.4.1 Estimators

In practice, due to the discrete nature of the studied sample, the function $\xi$
[Eq. (115)] is not measured at separation exactly equal to $r$ but rather one
must choose a bin, e.g., $[r, r + \Delta r[$. More generally, the quantity measured is

$$\frac{1}{G_p^{\infty} V_\infty^2}\int_{V_\infty} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2)\,\xi(r_{12}), \tag{378}$$

where the function $\Theta(\mathbf{r}_1, \mathbf{r}_2)$ is symmetric in its arguments (e.g. [624]). In
what follows, we assume that the function $\Theta$ is invariant under translations
and rotations, $\Theta(\mathbf{r}_1, \mathbf{r}_2) = \Theta(r)$, $r = r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$, is unity on a domain of
values of $r$, for example in the interval $[r, r + \Delta r[$, and vanishes otherwise. The
values where $\Theta$ is non-zero define a “bin” which we call $\Theta$ as well. We assume
that $\xi(r)$ is sufficiently smooth and that the bin and the normalization, $G_p^{\infty} V_\infty^2$,
are such that Eq. (378) would reduce with a good accuracy to $\xi(r)$ in a survey
of very large volume $V_\infty$.

Practical calculation of the two-point correlation function relies on the fact
that it can be defined in terms of the excess probability over random $\delta P$
of finding two galaxies separated by a distance (or an angle) $r$ [as discussed
already in Chapter 3, Eq. (127)]

$$\delta P = \bar n_g^2\,[1 + \xi(r)]\,\delta V_1\,\delta V_2, \tag{379}$$

where $\delta V_1$ and $\delta V_2$ are volume (surface) elements and $\bar n_g$ is the average number
density of objects.

Let $DD$ be the number of pairs of galaxies in the galaxy catalog belonging
to the bin $\Theta$ and $RR$ defined likewise but in a random (Poisson distributed)
catalog with same geometry and same number of objects, $N_r = N_g$. They read,

$$DD = \int_{\mathbf{r}_1 \neq \mathbf{r}_2} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2)\,n_g(\mathbf{r}_1)\,n_g(\mathbf{r}_2), \tag{380}$$

$$RR = \int_{\mathbf{r}_1 \neq \mathbf{r}_2} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2)\,n_r(\mathbf{r}_1)\,n_r(\mathbf{r}_2), \tag{381}$$

where $n_g$ and $n_r$ are local number density fields respectively in the galaxy
catalog and the random catalog:

$$n_g(\mathbf{r}) = \sum_{j=1}^{N_g} \delta_D(\mathbf{r} - \mathbf{x}_j), \tag{382}$$

where $\mathbf{x}_j$ are the galaxy positions and likewise for $n_r$. It is easy to derive from
Eq. (379) a simple estimator commonly used in the literature [503]:

$$\hat\xi(r) = \frac{DD}{RR} - 1. \tag{383}$$

61 For a review on existing estimators, see, e.g. [372,525].


Various alternatives have been proposed to improve the estimator given by
Eq. (383), in particular to reduce the cosmic bias induced by edge effects at
large separations. Detailed studies [373] suggest that the best of them is the
Landy & Szalay (LS) estimator [393]$^{62}$,

$$\hat\xi(r) = \frac{DD - 2\,DR + RR}{RR}, \tag{384}$$

where $DR$ is the number of pairs selected as previously but the first object
belongs to the galaxy sample and the second one to the random sample

$$DR = \int_{\mathbf{r}_1 \neq \mathbf{r}_2} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2)\,n_g(\mathbf{r}_1)\,n_r(\mathbf{r}_2). \tag{385}$$


The LS estimator, which formally can be written $(D_1 - R_1)(D_2 - R_2)/R_1 R_2$,
corresponds to the “intuitive” procedure of first calculating overdensities and
then expectation values; this has the obvious generalization to higher-order
correlation functions [624], see Sect. 6.6 for more details.

Note that the calculations of $DR$ and $RR$ can be improved to arbitrary accuracy
by increasing $N_r$ and applying the appropriate corrections to $DR$ and $RR$,
i.e. multiplying $DR$ and $RR$ by the ratios $N_g/N_r$ and $N_g(N_g - 1)/[N_r(N_r - 1)]$
respectively, to preserve normalization. Actually, $DR$ and $RR$ can be computed
numerically as integrals with a different method than generating a random
catalog, the latter being equivalent to Monte-Carlo simulation. It amounts to
replacing $DR$ and $RR$ by $DF$ and $FF$ with

$$DF = \bar n_g \int_{\mathbf{r}_1 \neq \mathbf{r}_2} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2)\,n_g(\mathbf{r}_1), \tag{386}$$

$$FF = \bar n_g^2 \int_{\mathbf{r}_1 \neq \mathbf{r}_2} d^{\mathcal{D}}r_1\,d^{\mathcal{D}}r_2\;\Theta(\mathbf{r}_1, \mathbf{r}_2). \tag{387}$$

In that case, the actual measurements are performed on pixelized data.

62 See however [525] for a more reserved point of view.

The LS estimator is theoretically optimal with respect to both cosmic bias
and cosmic error, at least in the weak correlation limit [393]; numerical
studies [373] show moreover that for practical purposes it is better than any other
known estimator based on pair counting. Among those one can quote
$(DD - DR)/RR$ [311], the popular $DD/DR - 1$ [172,68] and $DD\,RR/(DR)^2 - 1$ [291],
which is actually almost as good as LS [373]. In Sect. 6.8 we shall mention
other ways of measuring $\xi(r)$ and higher-order correlation functions, based on
multiple counts-in-cells.
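A brute-force implementation of the LS estimator, Eq. (384), takes only a few lines. The sketch below is meant for small catalogs (it builds the full distance matrix in memory), counts ordered pairs with $\mathbf{r}_1 \neq \mathbf{r}_2$ as in Eqs. (380)-(381), and rescales $DR$ and $RR$ by the factors $N_g/N_r$ and $N_g(N_g-1)/[N_r(N_r-1)]$ quoted above. The function names and the unit-box test geometry are illustrative choices.

```python
import numpy as np

def pair_counts(a, b, edges, same):
    """Histogram of ordered pair separations between point sets a and b.
    If same is True, a and b are the same catalog and self-pairs are dropped."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    if same:
        d = d[~np.eye(len(a), dtype=bool)]
    return np.histogram(d.ravel(), bins=edges)[0].astype(float)

def landy_szalay(gal, ran, edges):
    """LS estimator (DD - 2 DR + RR) / RR in the separation bins given by edges."""
    ng, nr = len(gal), len(ran)
    dd = pair_counts(gal, gal, edges, same=True)
    dr = pair_counts(gal, ran, edges, same=False) * ng / nr
    rr = pair_counts(ran, ran, edges, same=True) * ng * (ng - 1) / (nr * (nr - 1))
    return (dd - 2.0 * dr + rr) / rr

# Usage: an unclustered "galaxy" sample in the unit box should give xi close to 0.
rng = np.random.default_rng(3)
gal = rng.random((500, 3))
ran = rng.random((1000, 3))
xi = landy_szalay(gal, ran, np.linspace(0.1, 0.4, 4))
```

The random catalog must share the geometry (mask, boundaries) of the data for the edge corrections to work; here both are simply drawn uniformly in the same box.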

Finally, it is worth mentioning a few efficient methods used to measure $\xi(r)$,
which apply to any of the estimators discussed in this paragraph. The brute
force approach is indeed rather slow, since it scales typically as $O(N_g^2)$. To
improve the speed of the calculation, one often interpolates the sample onto a
grid and creates a linked list where each object points to a neighbor belonging
to the same grid site. For separations smaller than the grid step, $\Delta$, this method
scales roughly as $O(N_g N_{\rm cell})$, where $N_{\rm cell}$ is the typical number of objects
per grid cell. This approach is however limited by the step of the grid:
measuring the correlation function at scales large compared to $\Delta$ is rather
inefficient and can become prohibitive. Increasing $\Delta$ makes $N_{\rm cell}$ larger and for
too large $\Delta$, the method is slow again.
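The grid idea can be sketched as follows. For simplicity this example stores the members of each grid cell in a dictionary of index lists rather than in a true linked list, which plays the same role; it counts unordered pairs closer than `r_max` by comparing each cell only with itself and its immediate neighbors, with the grid step set equal to `r_max`. The function name is illustrative.

```python
import numpy as np
from collections import defaultdict
from itertools import product

def count_pairs_grid(pos, r_max):
    """Number of unordered pairs with separation < r_max, using a cell grid
    of step r_max so that only neighboring cells need to be compared."""
    pos = np.asarray(pos, dtype=float)
    cells = defaultdict(list)
    for i, key in enumerate(map(tuple, np.floor(pos / r_max).astype(int))):
        cells[key].append(i)
    offsets = list(product((-1, 0, 1), repeat=pos.shape[1]))
    n_pairs = 0
    for key, members in cells.items():
        a = pos[members]
        for off in offsets:
            nb = tuple(k + o for k, o in zip(key, off))
            if nb not in cells:
                continue
            if nb == key:
                # pairs inside the cell: count each unordered pair once
                d2 = ((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1)
                n_pairs += int(np.triu(d2 < r_max**2, k=1).sum())
            elif nb > key:
                # pairs between two distinct cells: visit each cell pair once
                b = pos[cells[nb]]
                d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
                n_pairs += int((d2 < r_max**2).sum())
    return n_pairs
```

The cost is governed by the number of objects per cell, which is why the approach degrades when the maximum separation, and hence the grid step, becomes large.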

Another scheme relies on a double walk in a quad-tree or an oct-tree,
according to the dimension of the survey (a hierarchical decomposition of space in
cubes/squares and subcubes/subsquares [461]). This approach is potentially
powerful, since it scales as $O(N_g^{3/2})$ according to its authors [461]. It is also
possible to rely on FFTs or fast harmonic transforms at large scales [636], but
this requires appropriate treatment of the Fourier coefficients to make sure that
the quantity finally measured corresponds to the estimator of interest, e.g. the
LS estimator (see [636] for a practical implementation in harmonic space).


6.J h 2 Cosmic Bias and Integral Constraint of the LS Estimator 
The full calculation of the cosmic bias and the cosmic error of the LS estimator
was done by Landy & Szalay [393] in the weak correlation limit and by
Bernstein [59] for the general case but neglecting edge effects, $r \ll L$, where
$L$ is the smallest size of the survey^63. At leading order in $r/L$ and assuming
that the density variance at the scale of the survey is small, the cosmic bias
reads


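$$ b_{\hat\xi} \simeq \left(3 - \frac{1}{\xi}\right)\bar\xi(L) - 2\,\frac{\bar\xi_3}{\xi}, \qquad r/L,\ |\bar\xi(L)|,\ \left|\frac{\bar\xi(L)}{\xi}\right| \ll 1, \tag{388} $$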


where 

$$ \bar\xi(L) = \frac{1}{V^2}\int d^D r_1\, d^D r_2\; \xi(r_{12}) \tag{389} $$

is the average of the correlation function over the survey volume (or area).
The quantity $\bar\xi_3$ is defined as

$$ \bar\xi_3 = \frac{1}{V^3 G_p}\int d^D r_1\, d^D r_2\, d^D r_3\; \Theta(r_{12})\, \xi_3(\mathbf r_1,\mathbf r_2,\mathbf r_3), \tag{390} $$

where $G_p$ is the form factor defined in [393] as

$$ G_p = \frac{1}{V^2}\int d^D r_1\, d^D r_2\; \Theta(r_{12}), \tag{391} $$

i.e. the probability of finding a pair included in the survey in bin $\Theta$. When $r/L$
is small enough it is simply given by $G_p \simeq 4\pi r^2 \Delta r/V$ (for a bin $\Theta = [r, r+\Delta r[$).

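Assuming the hierarchical model, Eq. (214), we get $\bar\xi_3 \simeq 2 Q_3\, \xi\, \bar\xi(L)$ and the
cosmic bias simplifies to

$$ b_{\hat\xi} \simeq \left(3 - 4 Q_3 - \frac{1}{\xi}\right)\bar\xi(L), \qquad r/L,\ |\bar\xi(L)|,\ \left|\frac{\bar\xi(L)}{\xi}\right| \ll 1. \tag{392} $$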


In the weak correlation limit, it simply reduces to [393]

$$ b_{\hat\xi} \simeq -\frac{\bar\xi(L)}{\xi}, \qquad |\xi|,\ |\bar\xi(L)| \ll 1. \tag{393} $$


The LS estimator, although designed to minimize both the cosmic error and
the cosmic bias and thus quite insensitive to edge effects and discreteness
effects, is still affected by finite-volume effects, proportional to $\bar\xi(L)$ (indeed the
latter cannot be reduced without prior assumptions about clustering at scales
larger than those probed by the survey, as discussed below). The corresponding
cosmic bias is negative, of small amplitude in the highly nonlinear regime, but
becomes significant when the separation $r$ becomes comparable to the survey
size. In this regime, where $\xi(r)$ is expected to be much smaller than unity,
Eq. (393) is generally valid: since $\langle\hat\xi\rangle = \xi\,(1 + b_{\hat\xi}) \simeq \xi - \bar\xi(L)$, the correct
value of $\xi$ is obtained by adding an unknown constant to the measured value.
This corresponds to the so-called integral constraint problem [502,508].
Physically, it arises in a finite survey because one is estimating the mean
density and fluctuations about it from the same sample, and thus the
fluctuation must vanish at the survey scale. In other words, one cannot
estimate correlations at the survey scale since there is only one sample
available of that size.

63 It is however important to notice a subtle difference between the two approaches:
Landy & Szalay use conditional averages with fixed number of galaxies in the catalog
$N_g$, while $N_g$ is kept random in Bernstein's approach. This difference is analyzed in
Sect. 6.10.

This bias cannot be corrected for unless a priori assumptions are
made on the shape of the two-point correlation function at scales larger than
those probed by the survey. One can for instance decide to model the two-point
correlation function as a power-law and do a joint determination of all parameters [502].
We will come back to this problem when discussing the case of the power
spectrum, where other corrections have been suggested, see Sect. 6.5.2.


6.4.3 Cosmic Error of the LS Estimator

The general computation of the cosmic error for such an estimator is quite
involved and has been derived in the literature in various cases. For instance, the
covariance of $DD - 2DF + FF$ between two bins $\Theta_a$ and $\Theta_b$ reads [500,291,634]


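$$ \begin{aligned}
{\rm Cov}(DD - 2DF + FF) = {} & \bar n_g^4 \int d^D r_1\, d^D r_2\, d^D r_3\, d^D r_4\; \Theta_a(\mathbf r_1,\mathbf r_2)\, \Theta_b(\mathbf r_3,\mathbf r_4) \\
& \quad \times \left[\xi_4(\mathbf r_1,\mathbf r_2,\mathbf r_3,\mathbf r_4) + \xi(\mathbf r_1,\mathbf r_3)\,\xi(\mathbf r_2,\mathbf r_4) + \xi(\mathbf r_1,\mathbf r_4)\,\xi(\mathbf r_2,\mathbf r_3)\right] \\
& + 4\,\bar n_g^3 \int d^D r_1\, d^D r_2\, d^D r_3\; \Theta_a(\mathbf r_1,\mathbf r_2)\, \Theta_b(\mathbf r_1,\mathbf r_3) \left[\xi(\mathbf r_2,\mathbf r_3) + \xi_3(\mathbf r_1,\mathbf r_2,\mathbf r_3)\right] \\
& + 2\,\bar n_g^2 \int d^D r_1\, d^D r_2\; \Theta_a(\mathbf r_1,\mathbf r_2)\, \Theta_b(\mathbf r_1,\mathbf r_2) \left[1 + \xi(\mathbf r_1,\mathbf r_2)\right].
\end{aligned} \tag{394} $$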

This is a general expression, i.e. it applies to the two-point correlation function
as well as the power-spectrum, or any pairwise statistics of the density field,
depending on the choice of the binning function $\Theta$. It does not, however, take
into account the possible cosmic fluctuations of the denominator in the LS
estimator. This latter effect is more cumbersome to compute because one has
to deal with moments of the inverse density. This is possible if one assumes
that fluctuations are small. This leads to the cosmic covariance derived in [59]
for the LS estimator. We give here a simplified expression of the diagonal term,
the cosmic error:


(£ 
"N*


2 >-0 


— 2 + 4(1 — 2 Q 3 + Qf)£(L) + 

g p e a 


iW 


£ring(l + 2Q3O 

£ 2 


+ Q3 — 1 


r / L, |£(L)|, |£(L)/£|« 1, (395) 


where $\bar\xi_2$ is the average of the square of the two-point correlation function over
the survey volume,

$$ \bar\xi_2 = \frac{1}{G_p^2 V^4}\int d^D r_1 \cdots d^D r_4\; \Theta(r_{12})\, \Theta(r_{34})\, \xi(r_{13})\, \xi(r_{24}), \tag{396} $$

and $\xi_{\rm ring}$ is the average of the two-point correlation function for pairs inside
the shell of radius $r$ and thickness $\Delta r$,

$$ \xi_{\rm ring} = \frac{1}{G_t V^3}\int d^D r_1\, d^D r_2\, d^D r_3\; \Theta(r_{12})\, \Theta(r_{13})\, \xi(r_{23}). \tag{397} $$

We have introduced the new geometrical factor $G_t$ given by [393]

$$ G_t = \frac{1}{V^3}\int d^D r_1\, d^D r_2\, d^D r_3\; \Theta(r_{12})\, \Theta(r_{13}), \tag{398} $$

i.e. $G_t$ is the probability, given one point, of finding two others in bin $\Theta$, for
example the interval $[r, r + \Delta r[$. As pointed out in [59], $\xi_{\rm ring} > \xi$, but

$$ \xi_{\rm ring} \simeq \xi \tag{399} $$

is a good approximation. In Eq. (395), a degenerate hierarchical model (Sect. 4.5.5)
has been assumed to simplify the results. A more general expression can be
found in [59] (see also [291,634]).

The finite volume errors are given by a term in $\bar\xi_2$ and one proportional to $\bar\xi(L)$.

It is interesting to compare these two contributions. For a power-law spectrum
of index $n$, $\bar\xi_2/\xi^2$ scales like $(r/L)^D$ whereas $\bar\xi(L)$ scales like $(L/r_0)^{-(D+n)}$ if
$r_0$ is the correlation length [$\xi(r_0) = 1$]. Therefore in the quasi-linear regime,
for which $r \gtrsim r_0$, and for surveys with a large number of objects, the first
term is likely to dominate (this is typically the case for wide angular surveys),
whereas for surveys which probe deeply into the nonlinear regime the other
terms are more likely to dominate.

The discreteness error is given by the term in $1/N_g$, which vanishes for a
randomized purely Poisson catalog. The intrinsic Poisson error is encoded in
the term in $(1/N_g)^2$. This estimate of the cosmic error however neglects edge
effects, which become significant at scales comparable to the size of the survey.

In this latter regime correlations are expected to be weak, and from [393] one
finds that the cosmic error is dominated by edge-discreteness effects [624]:

$$ \left(\frac{\Delta\hat\xi}{\xi}\right)^2 \simeq \frac{2}{N_g^2\,\xi^2}\left[\frac{1}{G_p} - 2\,\frac{G_t}{G_p^2} + 1\right], \qquad |\xi|,\ |\bar\xi(L)| \ll 1. \tag{400} $$

One can note that when $r/L$ is small enough, the term in square brackets is
roughly equal to $1/G_p$ [as in Eq. (395)], where $G_p$ is the fraction of pairs available
in the survey. This is obviously the dominant contribution to the error when
the bin size $\Delta r$ is very small. This pure Poisson contribution can generally be
computed exactly given the geometry of the survey.

The expressions (395) and (400) can be used to estimate the full cosmic error.
This method however requires prior assumptions about the hierarchical model
parameters $Q_3$ and $Q_4$ and for the integral of the two-point correlation function
over the survey volume, $\bar\xi(L)$. For this reason, the Gaussian limit is often used
to compute errors (that is, the contribution of $\bar\xi_2$, e.g. [410]), but this might
be a bad approximation when $\xi > 1$, as we discussed above^64.

6.4.4 The Covariance Matrix

As discussed above, Eq. (394) gives the cosmic covariance matrix of the two-point
correlation function assuming that $\bar n_g$ is perfectly determined, while the
calculation of Bernstein [59], for which we gave a simplified expression of the
diagonal terms, takes into account possible fluctuations in $\bar n_g$. We refer the
reader to [59] for the full expression, which is rather cumbersome.
Interestingly the pure Poisson contribution vanishes for non-overlapping bins
in Eq. (394). A simplified formula can be obtained in the Gaussian limit where
non-Gaussian and discreteness contributions can be neglected,

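$$ \begin{aligned}
C_\xi(r_a, r_b) &\equiv \langle\hat\xi(r_a)\,\hat\xi(r_b)\rangle - \langle\hat\xi(r_a)\rangle\langle\hat\xi(r_b)\rangle \\
&= \frac{1}{G_p(r_a)\, G_p(r_b)\, V^4}\int d^D r_1 \cdots d^D r_4\; \Theta_a(r_{12})\, \Theta_b(r_{34})\, \xi(r_{13})\, \xi(r_{24}),
\end{aligned} \tag{401} $$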

in particular, $C_\xi(r, r) = \bar\xi_2$ [Eq. (396)]. This expression can be conveniently
rewritten in terms of the power spectrum. It reads, for $D = 3$,

Ct(r a ,n) = ^ J k 2 dk[P(k)] 2 J 1/2 (kr a )J 1/2 (kr b ) (402) 

where $J_{1/2}$ is a Bessel function. A similar expression has been derived for 2-D
fields [204],

C w (9 a ,9 b ) = (w 2 {9 0 )w 2 (9 b )) - ( w 2 (9 a ))(w 2 (9 b )) 

kdk [P(k)] 2 J 0 (k9a) J 0 (k9 b ), (403) 

where is the area of the survey, $w_2(\theta)$ represents the angular two-point
function and $\hat w_2$ its estimator.

Note that as the volume/area of the survey increases, the diagonal terms in
Eq. (401) do not, in general, become dominant compared to the off-diagonal
ones. This is because correlation function measurements are statistically
correlated, even in the Gaussian limit, unlike binned power spectrum
measurements; see e.g. Sect. 6.5.4.

64 Figure 38 below, extracted from [564], illustrates that for the power-spectrum. 

6.4.5 Recipes for Error Calculations

The issue of cosmic error computation is recurrent in cosmological surveys and
the previous computations clearly show that this is a complex issue. Various
recipes have been proposed in the literature. A particularly popular one is
the bootstrap method [24]. We stress that bootstrap resampling is not suited
for correlation function measurements. Indeed, as shown explicitly in [597],
such a method does not lead, in general, to a reliable estimate of the cosmic
error [525,373].

Another popular and elementary way of estimating the errors consists in
dividing the catalog into a number of smaller subsamples of the same volume and
computing the dispersion in the measurements corresponding to each subsample
(e.g. [249]). This method is not free of bias and generally overestimates
the errors, since the obtained dispersion is an estimator of the cosmic error
on the subsamples and not the parent catalog. Recent studies on error
estimation [572,704] also suggest that the Jackknife method, which is a variant
of the subsample method where the $i$-th sample is obtained by removing the
$i$-th subsample, gives a very good estimate of the cosmic error on the two-point
correlation function. Unlike the subsample method, it does not lead to
overestimation of the cosmic error at large scales^65.

Of course, methods such as Jackknife and subsamples cannot lead to an
accurate estimation of finite-volume errors at the scale of the survey, since only
one realization of such a volume is available to the observer. This can only
be achieved through a detailed computation of the cosmic errors [Eqs. (395)
and (400)] with prior assumptions about the behavior of the statistics involved at
scales comparable to the survey size; or else numerically by constructing
multiple realizations of the survey, e.g. mock catalogs relying on $N$-body simulations
or simplified versions thereof (e.g., [571]). On the other hand, methods that
use the actual data are very useful to assess systematic errors, by comparing
to other external estimates such as those just mentioned.
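
As an illustration of the Jackknife recipe described above, here is a minimal sketch of the standard delete-one jackknife covariance. It assumes the catalog has already been split into disjoint subsamples and that a user-supplied `estimator` maps a list of subsamples to a vector of measurements (for instance a binned correlation function); both the function name `jackknife_covariance` and this interface are hypothetical choices made for the example, not taken from the references.

```python
import numpy as np


def jackknife_covariance(estimator, subsamples):
    """Delete-one jackknife mean and covariance of a vector of measurements.

    estimator  : callable taking a list of subsamples and returning a scalar
                 or 1D array (e.g. a binned correlation function)
    subsamples : sequence of n disjoint pieces of the catalog
    """
    subsamples = list(subsamples)
    n = len(subsamples)
    # The i-th jackknife sample is the catalog with the i-th subsample removed.
    estimates = np.array([
        np.atleast_1d(estimator(subsamples[:i] + subsamples[i + 1:]))
        for i in range(n)
    ])
    mean = estimates.mean(axis=0)
    diff = estimates - mean
    # The (n - 1)/n factor accounts for the strong overlap between samples.
    cov = (n - 1) / n * diff.T @ diff
    return mean, cov
```

As stressed in the text, such an estimate only uses the data at hand and therefore cannot capture finite-volume errors at the scale of the survey.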

6.5 The Power Spectrum 

The power spectrum $P(k)$ is simply the Fourier transform of the two-point
correlation function (see Sect. 3.2.2), and therefore it is formally subject to the
same effects. In fact, a common theoretical framework can be set up for $\xi(r)$
and $P(k)$ in order to find the best estimators (e.g., [293,294,624]). In practice,
however, power spectrum measurements have been undertaken mostly on
linear or weakly nonlinear scales, which are subject to edge effects that are
difficult to correct for. In this section, we introduce simple (unweighted) estimators and
discuss the biases and cosmic error introduced by the finiteness of the survey.


65 An alternative to these methods has been suggested by Hamilton [291], in which
many realizations from a given sample are generated by effectively varying the
pair-weighting function.

The techniques developed to measure $P(k)$ are numerous and sometimes very
elaborate (a nice review can be found in [648]), but most of them rely on the
assumption that the underlying statistics is Gaussian. In this section we prefer
to keep the statistical framework general and thus restrict ourselves to
traditional estimators. More sophisticated methods, using spatial weighting and
cross-correlations between bins, will be discussed in Sects. 6.9 and 6.11.


6.5.1 Simple Estimators 

For convenience, in finite surveys the adopted normalization convention for
the Fourier transforms and the power spectra is often different. This is the
reason why, in this subsection, we also adopt the following convention,

$$ A(\mathbf k) = \frac{1}{V}\int_V d^D x\; e^{-i\mathbf k\cdot\mathbf x}\, A(\mathbf x), \tag{404} $$


where $A(\mathbf k)$ are the Fourier modes of $A(\mathbf x)$ and $V$ is the survey volume (to
recover the convention used in Eq. (36), one can simply use the formal
correspondence $V \leftrightarrow (2\pi)^D$). The power spectrum is defined as the Fourier
transform of the two-point correlation function. It thus differs by a $V/(2\pi)^D$
normalization factor compared to the normalization adopted in the other
sections. The higher-order spectra are defined similarly from the higher-order
correlation functions in such a way that the functional relation between
spectra is preserved [e.g., the coefficients $Q$ in Eq. (154) are left unchanged].

As shown in previous sections, estimating the correlation function consists in
counting pairs in bins, both in the galaxy catalog and in random realizations
with the same survey geometry. This procedure can be generalized to the
measurement of the power-spectrum (e.g., [212]), for which the binning function $\Theta$
defined in Sect. 6.4.2 is now different. For one single mode the straightforward
choice would be (e.g., [624]) $\Theta(\mathbf r_1,\mathbf r_2) = [e^{i\mathbf k\cdot(\mathbf r_1-\mathbf r_2)} + e^{i\mathbf k\cdot(\mathbf r_2-\mathbf r_1)}]/2$. Actual
estimation of the power is made over a $k$-bin defined for instance so that the
magnitudes of the wave vectors belong to a given interval $[k, k + \Delta k[$. It means
that the function $\Theta$ to use actually reads,

$$ \Theta(\mathbf r_1,\mathbf r_2) = \langle e^{i\mathbf k\cdot(\mathbf r_1-\mathbf r_2)}\rangle_\Theta \equiv \frac{1}{V_k}\int_{|\mathbf k'|\in[k,k+\Delta k[} d^D k'\; e^{i\mathbf k'\cdot(\mathbf r_1-\mathbf r_2)}, \tag{405} $$


where $V_k$ is the volume of the bin in $k$-space. Note that for a rectangular
shaped survey with periodic boundaries modes are discrete and the number
of modes in $V_k$ is

$$ N_k = \frac{V\, V_k}{(2\pi)^D}. \tag{406} $$
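
As a quick numerical check of the mode-counting estimate $N_k \simeq V V_k/(2\pi)^D$, the sketch below counts the discrete modes of a periodic cubic box that fall in a spherical shell and compares with the continuum formula. It assumes $D = 3$ and a cubic box of side `box_size`; the function names are illustrative only.

```python
import numpy as np


def count_modes(box_size, k_min, k_max):
    """Number of Fourier modes of a periodic cube with k_min <= |k| < k_max."""
    k_f = 2.0 * np.pi / box_size              # fundamental frequency of the box
    n_max = int(np.ceil(k_max / k_f))
    n = np.arange(-n_max, n_max + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    k = k_f * np.sqrt(nx**2 + ny**2 + nz**2)
    return int(np.count_nonzero((k >= k_min) & (k < k_max)))


def predicted_modes(box_size, k_min, k_max):
    """Continuum estimate V * V_k / (2 pi)^3 for a spherical shell in k-space."""
    volume = box_size**3
    shell_volume = 4.0 / 3.0 * np.pi * (k_max**3 - k_min**3)   # V_k
    return volume * shell_volume / (2.0 * np.pi) ** 3
```

The two numbers agree well once the shell contains many modes, which is the regime assumed in what follows; for thin shells at small $k$ the discreteness of the modes becomes noticeable.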


In the following we assume that $V_k$ is large enough to encompass a sufficient
number of modes to make any measurement possible. With this expression of
$\Theta$, the quantities DD, DR, RR, DF and FF defined in Eqs. (380-387), where $\Theta$ is
replaced by Eq. (405), can be used to estimate the power spectrum [624].
Traditionally, the estimate of the power-spectrum is done in the following way:
the density contrast is Fourier transformed directly (e.g. [500,492,679,215,489]):




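$$ \hat\delta_{\mathbf k} = \frac{1}{V}\int_V \hat\delta(\mathbf x)\, e^{i\mathbf k\cdot\mathbf x}\, d^D x = \frac{1}{N_g}\sum_{j=1}^{N_g} e^{i\mathbf k\cdot\mathbf x_j} - W_{\mathbf k}, \tag{407} $$

where $W_{\mathbf k}$ is the Fourier transform of the window function of the survey,

$$ W_{\mathbf k} = \frac{1}{V}\int_V e^{i\mathbf k\cdot\mathbf x}\, d^D x. \tag{408} $$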


The power spectrum estimator is then given by,

$$ \hat P(k) = \langle|\hat\delta_{\mathbf k}|^2\rangle_\Theta - \frac{1}{N_g}, \tag{409} $$

where $\langle\ldots\rangle_\Theta$ stands for summation in the $k$-bin [e.g. Eq. (405)], which can also
be written

$$ \hat P(k) = \frac{1}{N_g^2}\,(DD - 2\,DF + FF). \tag{410} $$

Note that the correction for shot noise contribution is automatically taken
into account by the exclusion $\mathbf r_1 \neq \mathbf r_2$ in the integral DD. One can see that
this is analogous to the LS estimator (384) in Fourier space [624].
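
To make the shot-noise statement concrete, the following sketch evaluates the direct estimate $|\hat\delta_{\mathbf k}|^2 - 1/N_g$ for a single mode and compares it with the same quantity written as a sum over distinct pairs (the $DD/N_g^2$ piece). It assumes a periodic box and a wave vector that is a multiple of the fundamental frequency, so that $W_{\mathbf k} = 0$ and the $DF$ and $FF$ terms drop out; the function names are hypothetical.

```python
import numpy as np


def power_estimate(positions, k_vec):
    """|delta_k|^2 - 1/N_g for one mode, assuming W_k = 0 for that mode."""
    positions = np.asarray(positions, dtype=float)
    n_g = len(positions)
    phases = positions @ np.asarray(k_vec, dtype=float)
    delta_k = np.exp(1j * phases).sum() / n_g     # (1/N_g) sum_j exp(i k.x_j)
    return np.abs(delta_k) ** 2 - 1.0 / n_g       # explicit shot-noise subtraction


def power_estimate_pairs(positions, k_vec):
    """Same quantity as a sum over distinct pairs, i.e. excluding r_1 = r_2."""
    positions = np.asarray(positions, dtype=float)
    n_g = len(positions)
    phases = positions @ np.asarray(k_vec, dtype=float)
    cos = np.cos(phases[:, None] - phases[None, :])
    np.fill_diagonal(cos, 0.0)                    # drop the self-pairs
    return cos.sum() / n_g**2
```

The two functions return the same number up to rounding, since the self-pairs contribute exactly $N_g/N_g^2 = 1/N_g$ to $|\hat\delta_{\mathbf k}|^2$.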


6.5.2 Cosmic Bias and Integral Constraint 

As for the two-point correlation function, it is possible to show that
the estimator in Eq. (410) is biased [500,492], at least due to finite volume
effects. Again this is generally described as the integral constraint problem.
The expressions for the cosmic bias can be directly inferred from Eqs. (388) and
(393). More specifically, at large, weakly nonlinear scales, where the Gaussian
limit is a good approximation, the cosmic bias reads [492]

$$ b_{\hat P(k)} \simeq -P_*(0)\,\frac{\langle|W_{\mathbf k}|^2\rangle_\Theta}{\langle P_*(k)\rangle_\Theta}. \tag{411} $$


The quantity $P_*$ is the true power spectrum convolved with the Fourier
transform of the window function of the survey:

$$ P_*(k) = P(k) * |W_{\mathbf k}|^2. \tag{412} $$

Note that $P_*(0)$ is nothing but $\bar\xi(L)$ [Eq. (389)].

At smaller scales, in the regime $k \gg 1/L$, the cosmic bias reads,




(l^k| 2 )e 

(P( k))e 


2 (-B*(k, —k, O)) 0 
(P( k))e 


(413) 


where $B_*$ is the bispectrum (convolved with the Fourier transform of the survey
window).

In general the cosmic bias is approximated by the white noise value in the
Gaussian limit [489]

$$ b_{\hat P(k)} \simeq -\langle|W_{\mathbf k}|^2\rangle_\Theta = -FF/N_g^2, \tag{414} $$

and the corresponding correction is applied to the estimator (410).

An interesting approach to correct for the cosmic bias takes advantage of
the Gaussian limit expression, Eq. (411). Since the bias is proportional to
the Fourier transform of the window of the survey, construction of a tailored
window such that $W_{\mathbf k} = 0$ for each mode $\mathbf k$ of interest makes Eq. (411)
vanish [215,648]. However, one must keep in mind that this procedure is
approximate; even in the Gaussian limit there are higher-order corrections to the
result in Eq. (411) which are not proportional to $W_{\mathbf k}$^66.


6.5.3 The Cosmic Error 

The calculation of the cosmic error on the power-spectrum is formally
equivalent to that of the two-point correlation function. However, existing results
assume that the average number density of galaxies in the universe is an
external parameter, i.e. the ensemble average $\langle[\delta\hat P(k)]^2\rangle$ is calculated with $N_g$
fixed in Eq. (410).

In the limit when $k \gg 1/L$, where $L$ is the smallest size of the survey, for the
power spectrum equation (394) reads,


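$$ \left(\frac{\Delta\hat P(k)}{P(k)}\right)^2 \simeq \frac{2}{N_k} + \frac{\bar T(k,k)}{[P(k)]^2} + \frac{4}{N_g}\left[\frac{1}{N_k P(k)} + \frac{\bar B(k,k)}{[P(k)]^2}\right] + \frac{2}{N_g^2}\left[\frac{1}{N_k [P(k)]^2} + \frac{\bar P(k,k)}{[P(k)]^2}\right], \tag{415} $$

with

$$ \begin{aligned}
\bar T(k_i, k_j) &\equiv \langle T(\mathbf k_1, -\mathbf k_1, \mathbf k_2, -\mathbf k_2)\rangle_{\Theta_{k_i},\Theta_{k_j}} \\
&= \int_{|\mathbf k_1|\in[k_i,k_i+\Delta k_i[} \frac{d^D k_1}{V_{k_i}} \int_{|\mathbf k_2|\in[k_j,k_j+\Delta k_j[} \frac{d^D k_2}{V_{k_j}}\; T(\mathbf k_1, -\mathbf k_1, \mathbf k_2, -\mathbf k_2),
\end{aligned} \tag{416} $$

$$ \bar B(k_i, k_j) \equiv \langle B(\mathbf k_1, \mathbf k_2, -\mathbf k_1 - \mathbf k_2)\rangle_{\Theta_{k_i},\Theta_{k_j}}, \tag{417} $$

$$ \bar P(k_i, k_j) \equiv \frac{1}{2}\,\langle P(\mathbf k_1 + \mathbf k_2) + P(\mathbf k_1 - \mathbf k_2)\rangle_{\Theta_{k_i},\Theta_{k_j}}. \tag{418} $$

66 The cosmic bias in this expression comes in fact from the uncertainty in the
mean density $\bar n_g$ from the numerator in $\hat\delta = (n_g - \bar n_g)/\bar n_g$; uncertainties from the
denominator lead to additional contributions, see e.g. [328].
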

This result assumes that the true power spectrum is sufficiently smooth and
the bin in $k$-space thin enough that $\langle P(k)\rangle_{\Theta_k} \simeq P(k)$ and $\langle P(k)^2\rangle_{\Theta_k} \simeq [P(k)]^2$.
The continuous limit $N_g \to \infty$ of Eq. (415) was computed in [564], and the
Gaussian limit, $\bar B = \bar T = 0$, in [212].

From the calculations of [564], one gets

$$ \bar T(k,k) \simeq \frac{232}{441}\,[P(k)]^3 \tag{419} $$

in the regime where PT applies, and

$$ \bar T(k,k) \simeq (8 Q_{4,a} + 4 Q_{4,b})\,[P(k)]^3, \tag{420} $$

if the hierarchical model applies (Sect. 4.5.5) [564,296]. Similar calculations
can be done to evaluate $\bar B(k,k)$ and $\bar P(k,k)$.

One must emphasize again [452,564] the fact that the Gaussian limit,
traditionally used to compute errors and optimal weighting (see Sect. 6.9), is invalid
when $k > k_{\rm nl}$, where $k_{\rm nl}$ is the transition scale to the nonlinear regime defined
from the power spectrum, $4\pi k_{\rm nl}^3 P(k_{\rm nl}) = 1$. This is clearly illustrated by the top
panel of Figure 38. It compares the measured cosmic error obtained from the
dispersion over 20 PM simulations of SCDM with the Gaussian limit [564].
This shows that the Gaussian limit underestimates the cosmic error,
increasingly with $k/k_{\rm nl}$. Note however that the correction brought by Eq. (419) is
rather small. As a result the regime where the Gaussian limit is a reasonable
approximation for estimating the cosmic error extends up to values of
$k/k_{\rm nl}$ of order of a few. This is unfortunately not true for the full cosmic
covariance matrix $C_{ij} \equiv {\rm Cov}(\hat P_{k_i}, \hat P_{k_j})$, which deviates from the Gaussian
predictions (vanishing non-diagonal terms) as soon as $k \sim k_{\rm nl}$ [452,564], as we
now discuss.


6.5.4 The Covariance Matrix 

The covariance of the power spectrum, Eq. (367), can be easily written
beyond the Gaussian approximation neglecting shot noise and the window of the
survey [452,564]^67,

$$ C_{ij} = \frac{2\,[P(k_i)]^2}{N_{k_i}}\,\delta_{ij} + \bar T(k_i, k_j), \tag{421} $$

where $\delta_{ij}$ is a Kronecker delta and $\bar T$ is the bin-averaged trispectrum, Eq. (416).


67 See e.g. [293] for expressions including shot noise. 


Fig. 38. The top panel shows the measured cosmic error on the power spectrum
normalized by the Gaussian variance, obtained from the dispersion over 20 PM
simulations of SCDM. The dashed line shows the predictions of PT, and the solid
line the hierarchical scaling. The bottom panel shows the fractional error in the
band-power estimates. This fractional error scales with the size of the survey or
simulation box; the results in the figure correspond to a volume $V_0 = (100\,h^{-1}{\rm Mpc})^3$.
Results for other volumes can be obtained by scaling by $(V_0/V)^{1/2}$. The vertical
line on the x-axis indicates the non-linear scale. The width of shells in $k$-space is
$\Delta k = 2\pi/100\ h/{\rm Mpc}$.

The first term in Eq. (421) is the Gaussian contribution. In the Gaussian limit,
each Fourier mode is an independent Gaussian random variable. The power
estimates of different bands are therefore uncorrelated, and the covariance
(in units of $[P(k_i)]^2$) is simply given by $2/N_{k_i}$, where $N_{k_i}/2$ is the number of
independent Gaussian variables. The second term in Eq. (421) arises because of
non-Gaussianity, which generally introduces correlations between different
Fourier modes, and hence it is not diagonal in general.

Both terms in the covariance matrix in Eq. (421) are inversely
proportional to $V$ for a fixed bin size (recall that with the adopted convention $P(k)$
scales like $1/V$ and $T$ like $1/V^3$). But while the Gaussian contribution
decreases when $N_k$ increases, the non-Gaussian term remains constant.
Therefore, when the covariance matrix is dominated by the non-Gaussian
contribution, the only way to reduce the variance of the power spectrum is to increase
the volume of the survey instead of averaging over more Fourier modes.

The importance of the non-Gaussian contribution to the cross-correlation
between band powers was studied with numerical simulations in [452,564]; in
particular, [452] shows in detail that the correlations induced by non-linearities are
not negligible even at scales $k < k_{\rm nl}$, in agreement with PT predictions [564]. In
the non-linear regime, as expected, the cross-correlations are very strong;
indeed, the cross-correlation coefficient $r_{ij} \equiv C_{ij}/\sqrt{C_{ii} C_{jj}}$ is very close to unity.
Predictions for $r_{ij}$ from the hierarchical ansatz using HEPT amplitudes (see
Sect. 4.5.6) are in reasonable agreement with simulations [564], although at
large separations ($k_i \gg k_j$) there are significant deviations [564,296].
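
The interplay between the two terms of the band-power covariance, $C_{ij} = 2[P(k_i)]^2\delta_{ij}/N_{k_i} + \bar T(k_i,k_j)$, can be illustrated with a small numerical sketch. The bin-averaged trispectrum is passed in as a plain matrix; any values used with it here are toy numbers chosen for illustration, not predictions of PT or of the hierarchical model, and the function names are hypothetical.

```python
import numpy as np


def power_covariance(power, n_modes, trispectrum):
    """Band-power covariance: Gaussian diagonal term plus trispectrum term."""
    power = np.asarray(power, dtype=float)
    n_modes = np.asarray(n_modes, dtype=float)
    gaussian = np.diag(2.0 * power**2 / n_modes)      # 2 P_i^2 / N_{k_i} delta_ij
    return gaussian + np.asarray(trispectrum, dtype=float)


def correlation_coefficient(cov):
    """Cross-correlation coefficient r_ij = C_ij / sqrt(C_ii C_jj)."""
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)
```

With a vanishing trispectrum the coefficient matrix is the identity. Increasing the number of modes per bin at fixed trispectrum shrinks only the Gaussian diagonal term, so $r_{ij}$ moves toward unity, in line with the discussion above.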

An efficient (although approximate) numerical approach to computing the
covariance matrix of the power spectrum is presented in [571], using a
combination of 2LPT at large scales, and knowledge about dark matter halos at
small scales (see e.g. Sects. 7.1.3-7.1.4), which also makes it possible to take into account
the effects of redshift distortions and galaxy biasing.

6.6 Generalization to Higher-Order Correlation Functions 


Higher-order statistics such as correlation functions in real and Fourier space
have not been studied in as much detail as the power spectrum and the two-point
correlation function. In particular, there is no accurate analytic estimate of
the cosmic bias and error on such statistics^68, although a general formalism
(relying on a statistical framework set up by Ripley [537]), which we summarize
below, was recently developed by Szapudi and collaborators [624,633,634].

The LS estimator presented in Sect. 6.4.1 for the two-point correlation
function, $\langle\delta_1\delta_2\rangle$, can be formally written as $(D_1 - R_1)(D_2 - R_2)/R_1 R_2$. As
suggested in [624], a simple generalization for a statistic of order $N$, for
example the unconnected $N$-point correlation function, $f_N = \langle\delta_1\cdots\delta_N\rangle$, is simply
$(D_1 - R_1)(D_2 - R_2)\cdots(D_N - R_N)/R_1\cdots R_N$. More exactly, [624,634] define
symbolically an estimator $D^p R^q$ with $p + q = N$ for a function $\Theta$ symmetric
in its arguments

$$ D^p R^q = \sum \Theta(\mathbf x_1,\ldots,\mathbf x_p,\mathbf y_1,\ldots,\mathbf y_q), \tag{422} $$

where $\mathbf x_i \neq \mathbf x_j \in D$ and $\mathbf y_i \neq \mathbf y_j \in R$ are the positions of objects in the galaxy catalog
and the random catalog, respectively. The generalized LS estimator reads


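$$ \hat f_N = \frac{1}{S}\sum_i \binom{N}{i}\,(-1)^{N-i}\left(\frac{D}{\bar n_g}\right)^i\left(\frac{R}{\bar n_r}\right)^{N-i}, \tag{423} $$

where the normalization number $S$ is given by

$$ S = \int \Theta(\mathbf x_1,\ldots,\mathbf x_N)\; d^D x_1 \ldots d^D x_N. \tag{424} $$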

68 See however the attempt in [458] about estimating the error on £3 in various 
approximations. 

If $\bar n_g$ is determined with arbitrary accuracy, the estimator (423) is unbiased and
optimally edge corrected in the weak-correlation limit [624]. For practical
measurements, however, $\bar n_g$ is determined from the catalog itself, and the integral
constraint problem arises again, as described in Sect. 6.4.3.

The cosmic covariance of $\hat f_N$ assuming that $\bar n_g$ is perfectly determined was
given in [634],


Cov(/jv 1 , /)v 2 ) = {fNi,afN 2 ,b) ~ (fN lt a) (fN 2 ,b) 


= — T 

S2 h 


1( 


\ 


No 


(~i) i+j [E{i,j,N x ,N 2 ) 


V 3 


— ‘S’o{(1 7 • • • > i + 1, • • •, Ni + j )}], 


with 


/ f D\ pl / i?\ 
E(p 1 ,p 2 ,N 1 ,N 2 ) = l — ( —) 

\\ n g J \n r / 


_ / ( D \ P1 (R\ Nl ~ Pl ( d\ P2 /r\ N2 ~ P2S 

V / V n r / , 


/ 


Pi 

i 


( 


= E 

i 

where the operator Si is defined by 


P2 


i\ n iSj{/jv 1 +p 1 +p2—*} 


V 


(425) 


(426) 


$$ S_k\{g\} = \int d^D x_1 \ldots d^D x_{N_1+N_2-k}\; \Theta_a(1,\ldots,N_1)\; \Theta_b(1,\ldots,k,N_1+1,\ldots,N_1+N_2-k)\; g(1,\ldots,p_1,N_1+1,\ldots,N_1+p_2-k), \tag{427} $$

and the convention that $\binom{k}{l}$ is nonzero only for $k \geq 0$, $l \geq 0$ and $k \geq l$. In
these equations we have used the short-hand notations $1 = \mathbf x_1, \ldots, i = \mathbf x_i$,
etc., and $g$ should be viewed as $f_i(1,\ldots,i)\, f_j(N_1+1,\ldots,N_1+j)$ in Eq. (427)
to compute the $S_0$ term in Eq. (425).

Equation (425) assumes that the random catalog contains a very large
number of objects, $\bar n_r \to \infty$, i.e. does not take into account errors brought by the
finiteness of $N_r$ (see [634] for more details). Using a computer algebra
package, one can derive Eq. (394) from this formalism. A similar but cumbersome
expression for the three-point correlation function can be found in [634].
Note, as suggested in [624], that this formalism can be applied to Fourier
space, i.e. to the power-spectrum (see [636] for a practical implementation
of estimator $\hat f_2$ in harmonic space) and to the bispectrum. It can also be
theoretically applied to one-point distribution functions, such as counts-in-cells,
studied below, but this has not been done so far. Therefore, we shall instead present
results relying on a more traditional approach in the next section.

Note that for the bispectrum, some work has been done in computing its
covariance matrix and cosmic bias in particular cases. In [434], the bispectrum
covariance matrix is estimated including shot-noise terms and beyond the
Gaussian approximation^69 by using second-order Eulerian PT^70. A numerical
calculation of the bispectrum covariance matrix and the cosmic bias expected
for IRAS surveys is presented in [566] using 2LPT^71.


6.7 One-Point Distributions: Counts-in-Cells 
6.7.1 Definitions: 

The Count Probability Distribution Function (CPDF) was introduced in 
Sect. 6.3.2. Here we give more definitions on count-in-cells statistics, such as 
factorial moments and their relation to cumulants and the CPDF in terms 
of generating functions. Some additional information can be found as well in 
Appendix E. 

Following the presentation in Sect. 6.3.2, we discuss in more detail here an
elegant way of correcting for discreteness effects, which makes use of the factorial
moments. These are defined as follows:

$$ F_k \equiv \langle (N)_k\rangle = \langle N(N-1)\cdots(N-k+1)\rangle = \sum_N (N)_k\, P_N. \tag{428} $$

Note thus that $\bar N = F_1$. We have

$$ F_k = \bar N^k\, \langle(1+\delta)^k\rangle, \tag{429} $$

so $F_k/\bar N^k$ estimates directly the moment of order $k$ of the underlying (smoothed)
density field.

The generating function of the counts

$$ \mathcal P(t) \equiv \sum_N t^N P_N \tag{430} $$

is related to the moment generating function through

$$ \mathcal M(\bar N t) = \mathcal P(t+1). \tag{431} $$

69 Estimation of the cosmic error in the Gaussian approximation is given in [234,560].

70 This is however only approximate since a consistent calculation of the
connected six-point function requires up to fifth-order Eulerian PT, a quite complicated
calculation.

71 This is also not a consistent calculation of non-Gaussian terms in the covariance
matrix; however 2LPT does include significant contributions to any order in
Eulerian PT, and comparison for one-point moments suggests 2LPT is a very good
approximation [561].

Factorial moments thus verify

$$ F_k = \left(\frac{\partial}{\partial t}\right)^k \mathcal P(t+1)\,\Big|_{t=0}. \tag{432} $$


It is easy to find, using Eq. (141), the following useful recursion [619] relating
factorial moments to quantities of physical interest, $S_p$,

$$ S_p = \frac{\bar\xi\, F_p}{N_c^p} - \frac{1}{p}\sum_{q=1}^{p-1}\binom{p}{q}\,\frac{(p-q)\, S_{p-q}\, F_q}{N_c^q}, \tag{433} $$

where $N_c$ is the typical number of objects in a cell in overdense regions, $N_c \equiv \bar N\bar\xi$.
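
The recursion $S_p = \bar\xi F_p/N_c^p - (1/p)\sum_{q=1}^{p-1}\binom{p}{q}(p-q)S_{p-q}F_q/N_c^q$ is straightforward to code. The sketch below assumes the usual starting value $S_1 = 1$ and computes $\bar\xi$ from $F_2/F_1^2 - 1$; the function name and the list-based interface are illustrative choices.

```python
from math import comb


def s_params_from_factorial_moments(F, p_max):
    """Return {p: S_p} for p = 1..p_max from factorial moments F[0..p_max].

    F[k] is the factorial moment of order k, with F[0] = 1 and F[1] = N_bar.
    Assumes S_1 = 1 as the starting value of the recursion.
    """
    xi_bar = F[2] / F[1] ** 2 - 1.0       # variance of the smoothed field
    n_c = F[1] * xi_bar                   # N_c = N_bar * xi_bar
    S = {1: 1.0}
    for p in range(2, p_max + 1):
        total = sum(
            comb(p, q) * (p - q) * S[p - q] * F[q] / n_c**q
            for q in range(1, p)
        )
        S[p] = xi_bar * F[p] / n_c**p - total / p
    return S
```

For example, building the factorial moments from $\bar N = 5$, $\bar\xi = 0.2$, $S_3 = 3$ and $S_4 = 16$ via $F_k = \bar N^k\langle(1+\delta)^k\rangle$ and feeding them back into the recursion recovers $S_2 = 1$, $S_3 = 3$ and $S_4 = 16$ up to rounding.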


6.7.2 Estimators 

In practice, the measurement of the CPDF and its factorial moments is very
simple. It consists of throwing $C$ cells at random in the catalog and computing

$$ \hat P_N = \frac{1}{C}\sum_{i=1}^{C} \delta_{N_i,N}, \tag{434} $$

where $\delta_{N,M}$ is the Kronecker delta function, and $N_i$ denotes the number of
objects in cell "$i$". Similarly, the estimator for the factorial moment of order
$k$ is

$$ \hat F_k = \frac{1}{C}\sum_{i=1}^{C} (N_i)_k, \tag{435} $$

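or can be derived directly from $\hat P_N$ using Eq. (428). Estimators (434) and (435)
are unbiased. However, if one uses the relation (433) to compute cumulants
from factorial moments, i.e.

$$ \hat{\bar\xi} = \frac{\hat F_2}{\hat F_1^2} - 1, \tag{436} $$

$$ \hat S_3 = \frac{\hat F_1\,(\hat F_3 - 3\hat F_1\hat F_2 + 2\hat F_1^3)}{(\hat F_2 - \hat F_1^2)^2}, \tag{437} $$

$$ \hat S_4 = \frac{\hat F_1^2\,(\hat F_4 - 4\hat F_3\hat F_1 - 3\hat F_2^2 + 12\hat F_2\hat F_1^2 - 6\hat F_1^4)}{(\hat F_2 - \hat F_1^2)^3}, \tag{438} $$

the corresponding estimators are biased, because nonlinear combinations of
estimators are generally biased (e.g. [328,630]).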
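
A minimal sketch of these counts-in-cells estimators is given below. It takes as input an array of counts $N_i$ measured in $C$ cells (how the cells are thrown is left out), and returns the estimated CPDF, the factorial moments, and the nonlinear combinations for $\bar\xi$ and $S_3$. The function names are hypothetical.

```python
import numpy as np


def count_pdf(counts):
    """Estimated CPDF: fraction of cells containing N objects, N = 0..max."""
    counts = np.asarray(counts, dtype=int)
    return np.bincount(counts) / counts.size


def factorial_moment(counts, k):
    """Estimated factorial moment: mean over cells of N (N-1) ... (N-k+1)."""
    counts = np.asarray(counts, dtype=float)
    term = np.ones_like(counts)
    for j in range(k):
        term *= counts - j
    return term.mean()


def xi_bar_and_s3(counts):
    """Nonlinear combinations of factorial moments giving xi_bar and S_3."""
    f1, f2, f3 = (factorial_moment(counts, k) for k in (1, 2, 3))
    xi_bar = f2 / f1**2 - 1.0
    s3 = f1 * (f3 - 3.0 * f1 * f2 + 2.0 * f1**3) / (f2 - f1**2) ** 2
    return xi_bar, s3
```

The first two functions are linear in the data and hence unbiased, whereas `xi_bar_and_s3` combines them nonlinearly and inherits the bias discussed above.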

To reduce the bias and the errors on direct measurements of cumulants from
Eqs. (436), (437), (438) it is possible to use some prior information, for example
by assuming that the PDF of the underlying density field is given by the
Edgeworth expansion, Eq. (144), convolved with a Poisson distribution to take
into account discreteness, Eq. (369). This procedure was actually applied to
the IRAS 1.2Jy galaxy catalog [377]. The advantage of such a method is that it
can be less sensitive to finite volume effects by using the shape of the PDF near
its peak (since finite volume effects mainly affect the tails). One disadvantage
is that the validity of the Edgeworth expansion is quite restricted, even in
the weakly non-linear regime (see, e.g. [356]). In particular, the PDF is not
positive definite. Convolution with the Poisson distribution to account for
discreteness alleviates this problem for the sparse IRAS surveys [377]; however,
for applications to the next generation of galaxy surveys this will likely not
be the case. Another difficulty of this approach is that error estimation is not
straightforward. On the other hand, the idea of using prior information on
the shape of the PDF to estimate moments is certainly worth pursuing with
a more detailed modeling of the density PDF.

6.7.3 Error Propagation: Cosmic Bias vs. Cosmic Error

We now review the theory of error propagation in a general setting for
functions of correlated random variables, following the treatment in [630]^72. This
theory was actually behind the calculation of the errors on the two-point
correlation function in Sect. 6.4. Since the calculations are necessarily technical,
we only present computations of the cosmic bias and error on nonlinear
estimators such as those given by Eqs. (436), (437) and (438).

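Let us suppose that we measure a quantity $f(\hat{\mathbf{x}})$, where $\hat{\mathbf{x}}$ is a vector of
unbiased estimators, such as the factorial moments, and that the measurement of
$\hat{\mathbf{x}}$ is sufficiently close to the ensemble average $\langle \hat{\mathbf{x}} \rangle = \mathbf{x}$. Then $f$ can be
expanded around the mean value

$$ f(\hat{\mathbf{x}}) = f(\mathbf{x}) + \sum_k \frac{\partial f}{\partial x_k}\, \delta\hat{x}_k + \frac{1}{2} \sum_{k,l} \frac{\partial^2 f}{\partial x_k \partial x_l}\, \delta\hat{x}_k\, \delta\hat{x}_l + O(\delta x^3), \tag{439} $$

where $\hat{x}_k$ is the $k$-th component of $\hat{\mathbf{x}}$ and

$$ \delta\hat{x}_k = \hat{x}_k - x_k. \tag{440} $$

After ensemble average of Eq. (439) one obtains

$$ \langle f \rangle = f(\mathbf{x}) + \frac{1}{2} \sum_{k,l} \frac{\partial^2 f}{\partial x_k \partial x_l}\, \langle \delta\hat{x}_k\, \delta\hat{x}_l \rangle + O(\delta x^3). \tag{441} $$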

72 For a different approach, based on an expansion in terms of the variance at the
scale of the survey, see [328].

To second order the cosmic bias [Eq. (363)] thus reads

$$ b_f \simeq \frac{1}{2 f(\mathbf{x})} \sum_{k,l} \frac{\partial^2 f}{\partial x_k \partial x_l}\, \langle \delta\hat{x}_k\, \delta\hat{x}_l \rangle. \tag{442} $$

Similarly the covariance between two functions $f$ and $g$ is

$$ \mathrm{Cov}(f,g) = \langle \delta\hat{f}\, \delta\hat{g} \rangle = \sum_{k,l} \frac{\partial f}{\partial x_k} \frac{\partial g}{\partial x_l}\, \langle \delta\hat{x}_k\, \delta\hat{x}_l \rangle + O(\delta x^3). \tag{443} $$

In particular, the relative cosmic error is given by

$$ \sigma_f \equiv \frac{\Delta f}{\langle f \rangle} = \frac{\sqrt{\mathrm{Cov}(f,f)}}{\langle f \rangle}. \tag{444} $$

It is important to notice the following point, from Eqs. (442) and (443):

$$ b_f \sim O(\sigma_f^2). \tag{445} $$

The range of applicability of this perturbative theory of error propagation is
$\langle \delta\hat{x}_k\, \delta\hat{x}_l \rangle / (x_k x_l) \ll 1$: errors and cross-correlations of the vector $\hat{\mathbf{x}}$ must be weak.
In this regime the cosmic bias is always smaller than the relative cosmic error,
except for accidental cancellations in Eq. (442) (in that case, the next order
would be needed in the expansion). When the cosmic bias becomes large the
expansion in Eq. (439) breaks down; in this case, numerical simulations show
that the cosmic bias can be larger than the relative cosmic error [328].
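
The expansion in Eqs. (442)-(444) can be evaluated numerically for any smooth
function of unbiased estimators. The sketch below is only an illustration: the
function `propagate`, the finite-difference step and the input numbers are
assumptions made here, not part of the original treatment.

```python
def propagate(f, x, cov, rel_step=1e-3):
    """Cosmic bias, Eq. (442), and relative cosmic error, Eqs. (443)-(444),
    of f(x_hat) given the means x and the covariance matrix cov of x_hat.
    Derivatives are approximated by central finite differences."""
    n = len(x)
    h = [rel_step * max(abs(v), 1.0) for v in x]

    def shifted(*moves):
        y = list(x)
        for idx, sign in moves:
            y[idx] += sign * h[idx]
        return f(y)

    grad = [(shifted((k, 1)) - shifted((k, -1))) / (2 * h[k]) for k in range(n)]
    hess = [[(shifted((k, 1), (l, 1)) - shifted((k, 1), (l, -1))
              - shifted((k, -1), (l, 1)) + shifted((k, -1), (l, -1)))
             / (4 * h[k] * h[l]) for l in range(n)] for k in range(n)]
    f0 = f(x)
    bias = sum(hess[k][l] * cov[k][l] for k in range(n) for l in range(n)) / (2 * f0)
    var = sum(grad[k] * grad[l] * cov[k][l] for k in range(n) for l in range(n))
    return bias, var**0.5 / abs(f0)

# Example: xi_bar = F2 / F1^2 - 1 with hypothetical moments and covariances.
xi_of_F = lambda F: F[1] / F[0]**2 - 1.0
bias, rel_err = propagate(xi_of_F, [10.0, 150.0], [[0.01, 0.1], [0.1, 4.0]])
```

In this toy example the bias is of order the square of the relative error, as
stated in Eq. (445).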


6.7.4 Cosmic Error and Cross-Correlations of Factorial Moments
According to the above formalism, knowledge of the errors and cross-correlations
on a complete set of unbiased estimators, such as the factorial moments
$F_k$, $k = 1, \ldots, \infty$, or the count-in-cells themselves, $P_N$, allows the calculation of
the cosmic error (or cross-correlations) on any counts-in-cells statistic. The
general theoretical framework for computing the cosmic error on factorial
moments can be found in [621] and [630] (footnote 73). Here we review the main results.
First, it is important to notice that there is a source of error due to the
finiteness of the number of cells $C$ used in Eqs. (434) and (435). This source of
error, which is estimated in [621], can be rendered arbitrarily small by taking a
very large number of sampling cells, $C$, or by using an algorithm equivalent
to infinite sampling, $C \to \infty$, as proposed in [625]. Measurements are often
done using $C \simeq V/v$, i.e. the number of cells necessary to cover the sample,
which is not a good idea. Indeed, such a small number of sampling cells does
not, in general, extract all the statistically significant information from the
catalog, except in some particular regimes in the Poisson limit. The best way
to measure count-in-cells statistics is thus to oversample as massively as
possible (footnote 74) and to estimate the cosmic error independently, as explained below.
Similarly, when measuring the two-point correlation function using a Poisson
sample $R$ to estimate $RR$ and $DR$, the random catalog $R$ should have as many
objects as possible in order to avoid adding noise to the measurements.
With that in mind, we shall assume from now on that $C$ is very large.

73 See the earlier work in [149] for detailed calculations of the void probability
function cosmic error.

The error generating function is defined as follows:

$$ E(x,y) = \sum_{N,M} \left[ \langle \hat P_N \hat P_M \rangle - \langle \hat P_N \rangle \langle \hat P_M \rangle \right] x^N y^M, \tag{446} $$

where the ensemble average $\langle \ldots \rangle$ denotes the average over a large number of
realizations of the catalog with the same geometry and the same underlying statistics.
Then, the cosmic covariance on factorial moments and count-in-cells reads

$$ \Delta_{kl} \equiv \mathrm{Cov}(\hat F_k, \hat F_l) = \left( \frac{\partial}{\partial x} \right)^k \left( \frac{\partial}{\partial y} \right)^l E(x+1, y+1) \Big|_{x=y=0}, \tag{447} $$

$$ \mathrm{Cov}(\hat P_N, \hat P_M) = \frac{1}{N!\, M!} \left( \frac{\partial}{\partial x} \right)^N \left( \frac{\partial}{\partial y} \right)^M E(x,y) \Big|_{x=y=0}. \tag{448} $$

The error generating function can be written in terms of bivariate distributions:

$$ E(x,y) = \frac{1}{V^2} \int_V d^D r_1\, d^D r_2 \left[ \mathcal{P}(x,y) - \mathcal{P}(x)\, \mathcal{P}(y) \right]. \tag{449} $$

In this equation, $V$ is the volume covered by cells included in the catalog and
$\mathcal{P}(x,y)$ is the generating function of bicounts $P_{N,M}$ for cells separated by a
distance $|\mathbf{r}_1 - \mathbf{r}_2|$ (see also Sect. 6.8 below):

$$ \mathcal{P}(x,y) = \sum_{N,M} x^N y^M P_{N,M}. \tag{450} $$

The calculation of the function $E(x,y)$, detailed in Appendix F, is simplified
by separating the integral in Eq. (449) into two components, $E_{\rm overlap}(x,y)$ and
$E_{\rm disjoint}(x,y)$, according to whether cells overlap or not.

At leading order in $v/V$, $\Delta_{kl}$ has three contributions

$$ \Delta_{kl} = \Delta^F_{kl} + \Delta^E_{kl} + \Delta^D_{kl}, \tag{451} $$

where $\Delta^F_{kl}$, $\Delta^E_{kl}$ and $\Delta^D_{kl}$ are the finite volume, edge and discreteness effect
contributions, respectively. From [621] and [630], the first few terms in the
three-dimensional case are listed in Appendix F.

74 This is because missing cluster cores, which occupy a very small fraction of the
volume, leads to underestimation of higher-order moments.

The finite-volume error comes from the disjoint cells contribution in the
error generating function. The corresponding relative error, or cross-correlation,
$\Delta^F_{kl}/(F_k F_l)$, does not depend on the number of objects in the catalog, and
is proportional to the integral of the two-point correlation function over the
survey volume:

$$ \bar\xi(L) = \frac{1}{V^2} \int_V d^D r_1\, d^D r_2\, \xi(r_{12}). \tag{452} $$

The edge effect term, $\Delta^E_{kl}/(F_k F_l)$, is the contribution remaining from
overlapping cells in the continuous limit, $\bar N \to \infty$. It does not depend on the
number of objects in the catalog and is proportional to $\bar\xi\, v/V$. A pure Poisson
sample does not have edge effect error at leading order in $v/V$, in agreement
with intuition. The discreteness effect error, $\Delta^D_{kl}/(F_k F_l)$, is the contribution
from overlapping cells which depends on $\bar N$ and thus disappears in the
continuous limit. As discussed in the introduction of this chapter, the separation
between these three contributions is useful but somewhat arbitrary. For
example, Eq. (452) actually contains some edge effects through the constraint
$r_{12} \geq 2R$, as shown in Appendix F.
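
The integral in Eq. (452) is easy to estimate by Monte Carlo sampling of pairs
of points in the survey volume. The sketch below assumes a cubic survey and a
pure power-law correlation function; the geometry, the parameters `r0` and
`gamma`, and the function names are hypothetical choices made for illustration.

```python
import random

def xi_bar_survey(xi, box_size, n_pairs=20000, seed=1):
    """Monte Carlo estimate of Eq. (452) for a cubic survey of side box_size:
    the average of xi(r12) over pairs of points drawn uniformly in the volume."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        p = [rng.uniform(0.0, box_size) for _ in range(3)]
        q = [rng.uniform(0.0, box_size) for _ in range(3)]
        r12 = sum((a - b)**2 for a, b in zip(p, q))**0.5
        total += xi(r12)
    return total / n_pairs

# Hypothetical power-law correlation function, xi(r) = (r / r0)^(-gamma).
power_law = lambda r, r0=5.0, gamma=1.8: (r / r0)**(-gamma)
small = xi_bar_survey(power_law, box_size=100.0)
large = xi_bar_survey(power_law, box_size=400.0)
```

For a power law the estimate scales as the survey size to the power `-gamma`,
so the finite-volume contribution decreases for larger surveys.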

Furthermore, if next to leading order contributions in $v/V$ are considered,
the corresponding correction is proportional to the contour of the survey,
$\partial V$ [537,154]. Each contribution $\Delta^X_{kl}/(F_k F_l)$, $X$ = F, E or D, contains a term
proportional to $\partial V$. This correction is an edge correction, leading to terms
such as edge-finite-volume and edge-discreteness contributions in our
nomenclature.

It is important to emphasize that the expressions given in Appendix F are of
direct practical use (footnote 75) for estimating errors on factorial moments or on
cumulants (Sect. 6.7.5) using the theory of propagation of errors explained above
(see e.g. [319,632,635] for applications to actual measurements in real galaxy
catalogs). As in Eq. (395), a careful examination of these expressions
shows that prior knowledge of the shape of the two-point correlation function
$\xi$ [namely, $\bar\xi$ and $\bar\xi(L)$] and of higher-order statistics, $S_p$ and $C_{pq}$ up to some
value of $p$ and $q$, is necessary to compute $\Delta_{kl}$. To estimate the cumulants $\bar\xi$ and
$S_p$, one can simply use the values directly measured in the catalog or other
existing estimates (e.g. [249,622]), as well as existing fitting formulae for $\bar\xi$
([289,493,335,494], see Sect. 4.5.4) and PT, EPT ([151], see Sect. 5.13) or
HEPT ([563], see Sect. 4.5.6) for $S_p$. To compute $\bar\xi(L)$ it is necessary to make
assumptions about the cosmological model. The cumulant correlators $C_{pq}$ can
be estimated directly from the catalog or from various models which further
simplify the calculations (e.g. [41,619,630]). These models can be particular
cases of the hierarchical model, Eq. (214), or can rely on PT results (Sect. 5.12)
or extensions such as E²PT (Sect. 5.13).

75 They have been implemented in the publicly available FORCE package [630].

Among the models tested, the best one known so far is E²PT, as illustrated by
Figure 39. In this figure, taken from [153], the cosmic error on factorial moments is
measured from the dispersion over 4096 subsamples of size $L = 125\, h^{-1}$ Mpc,
extracted from a τCDM simulation of size $2000\, h^{-1}$ Mpc involving $1000^3$
particles [206]. The accuracy of theoretical predictions is quite good, especially
at large, weakly nonlinear scales. At small scales, all the models tend to
overestimate the magnitude of the errors, including E²PT, but the disagreement
between theory and measurements is at most a factor of two approximately. This
discrepancy suggests that details of the dynamics still need to be understood
in order to describe appropriately multivariate distribution functions in the
highly nonlinear regime.



Fig. 39. The relative cosmic error on factorial moments measured as a function
of scale [153], obtained from the dispersion over a large ensemble of subsamples
extracted from one of the Hubble volume simulations [206], as explained in the text.
The dotted, dashed, long dashed and dot-long dashed curves correspond respectively
to theoretical predictions based on two particular cases of the hierarchical model,
namely SS and BeS, E²PT and PT. The SS model [619] assumes $Q_{NM} = Q_{N+M}$
with the definition in Eq. (F.24). The BeS model [41] is more complicated, but
obeys $Q_{NM} = Q_{N1} Q_{M1}$, as in the E²PT framework described in Sect. 5.13. The
PT results are shown only in the weakly nonlinear regime, $\bar\xi < 1$.

6.7.5 Cosmic Error and Cosmic Bias of Cumulants

Using the results in Sects. 6.7.3 and 6.7.4 it is possible to compute the cosmic
bias and the cosmic error on estimators (436), (437) and (438) (see also [328]).
It would be too cumbersome to put all the results here, but obtaining analytic
expressions is very easy with standard mathematical packages. For example,
simple algebraic calculations give for the cosmic bias

$$ b_{\bar\xi} \simeq \frac{F_2}{\bar N^2\, \bar\xi} \left( \frac{3 \Delta_{11}}{\bar N^2} - \frac{2 \Delta_{12}}{\bar N F_2} \right). \tag{453} $$

[Eq. (454), the analogous expression for $b_{S_3}$, is only partly legible in the
source; it combines the terms $6\Delta_{11}/\bar N^2 - 3\Delta_{13}/(\bar N F_3)$ and
$3\Delta_{11}/\bar N^2 - 2\Delta_{12}/(\bar N F_2)$ with contributions from $\Delta_{22}$ and $\Delta_{23}$.]

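Similarly, the cosmic errors read

$$ \sigma_{\bar\xi}^2 \simeq \frac{1}{\bar N^6\, \bar\xi^2} \left( 4 F_2^2 \Delta_{11} - 4 \bar N F_2 \Delta_{12} + \bar N^2 \Delta_{22} \right), \tag{455} $$

$$ \begin{aligned}
\sigma_{S_3}^2 \simeq \frac{1}{\bar N^{12}\, \bar\xi^6 S_3^2} \Big[ & \left( 2 \bar N^3 F_2 - 6 \bar N F_2^2 + 3 \bar N^2 F_3 + F_2 F_3 \right)^2 \Delta_{11} \\
& + 2 \bar N \big( -2 \bar N^6 F_2 + 12 \bar N^4 F_2^2 - 18 \bar N^2 F_2^3 - 3 \bar N^5 F_3 \\
& \qquad + 4 \bar N^3 F_2 F_3 + 15 \bar N F_2^2 F_3 - 6 \bar N^2 F_3^2 - 2 F_2 F_3^2 \big) \Delta_{12} \\
& + 2 \bar N^3 \bar\xi \left( 2 \bar N^3 F_2 - 6 \bar N F_2^2 + 3 \bar N^2 F_3 + F_2 F_3 \right) \Delta_{13} \\
& + \bar N^2 \left( \bar N^3 - 3 \bar N F_2 + 2 F_3 \right)^2 \Delta_{22} \\
& - 2 \bar N^4 \bar\xi \left( \bar N^3 - 3 \bar N F_2 + 2 F_3 \right) \Delta_{23} + \bar N^6 \bar\xi^2 \Delta_{33} \Big].
\end{aligned} \tag{457} $$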
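
The closed forms for the cosmic errors on $\bar\xi$ and $S_3$ follow from Eq. (443)
applied to Eqs. (436) and (437). The sketch below evaluates them; in the $S_3$
case the three coefficients are the partial derivatives of Eq. (437) with
respect to $F_1$, $F_2$ and $F_3$, each multiplied by $(F_2 - \bar N^2)^3$. The
function names and the numerical inputs are hypothetical.

```python
def sigma2_xi(N, F2, D):
    """Eq. (455): squared relative cosmic error on xi_bar.
    D[(k, l)] holds the covariances Delta_kl of the factorial moments."""
    xi = F2 / N**2 - 1.0
    return (4*F2**2*D[1, 1] - 4*N*F2*D[1, 2] + N**2*D[2, 2]) / (N**6 * xi**2)

def sigma2_s3(N, F2, F3, D):
    """Squared relative cosmic error on S3 from leading-order error propagation."""
    xi = F2 / N**2 - 1.0
    s3 = N * (F3 - 3*N*F2 + 2*N**3) / (F2 - N**2)**2
    a = 2*N**3*F2 - 6*N*F2**2 + 3*N**2*F3 + F2*F3   # dS3/dF1 times (F2 - N^2)^3
    b = -N * (N**3 - 3*N*F2 + 2*F3)                  # dS3/dF2 times (F2 - N^2)^3
    c = N**3 * xi                                    # dS3/dF3 times (F2 - N^2)^3
    total = (a*a*D[1, 1] + 2*a*b*D[1, 2] + 2*a*c*D[1, 3]
             + b*b*D[2, 2] + 2*b*c*D[2, 3] + c*c*D[3, 3])
    return total / (N**12 * xi**6 * s3**2)

# Hypothetical moments (xi_bar = 0.5, S3 = 3) and covariances Delta_kl.
D = {(1, 1): 0.01, (1, 2): 0.1, (1, 3): 2.0,
     (2, 2): 4.0, (2, 3): 60.0, (3, 3): 2000.0}
err_xi = sigma2_xi(10.0, 150.0, D)**0.5
err_s3 = sigma2_s3(10.0, 150.0, 3250.0, D)**0.5
```

Note that the $\Delta_{12}$ and $\Delta_{23}$ cross terms enter with a negative sign,
because $S_3$ decreases when $F_2$ increases at fixed $F_1$ and $F_3$.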


It is interesting to compare the results obtained for $\bar\xi$ to what was derived for
the function $\xi(r)$. For example, replacing $\Delta_{kl}$ and $F_k$ with their values as functions
of $\bar N$ and cumulants leads to the following result for the cosmic bias in the
3-D case [630]:

$$ b_{\bar\xi} \simeq \left( 0.04 - \ldots \right) \frac{v}{V \bar N} + \left( 16.5 - 7.6\, S_3 - \ldots \right) \bar\xi\, \frac{v}{V} + \left( 3 - 2 C_{12} - \frac{1}{\bar\xi} \right) \bar\xi(L). \tag{458} $$

[Ellipses mark coefficients that are illegible in the source.]

In this equation, valid in the perturbative regime ($|b_{\bar\xi}| \ll 1$) and at
leading order in $v/V$, one can recognize in the first, second and third terms
the discreteness, edge, and finite volume effect contributions, respectively. As
expected, the last term is very similar to Eq. (392). Note that the discreteness
effect term is rather small and can be neglected in most realistic situations,
in agreement with Eq. (392). An alternative calculation of $b_{\bar\xi}$ can be found
in [328] with similar conclusions.

Fig. 40. Comparison of the cosmic errors for the factorial and connected moments
expected in the SDSS [630]. Standard CDM is assumed for the two-point correlation
function and E²PT with $n_{\rm eff} = -2.5$ for higher-order statistics. Solid, dotted, dash,
and long dash lines correspond to orders 1 through 4, respectively. Of each pair of
curves with the same line type, the one turning up on large scales relates to the
cumulant. Note that the perturbative approach used to compute the cosmic error
on the cumulants fails at large scales, explaining the right stopping point of the long
dash curve for $S_4$.

Figure 40 displays the cosmic error as a function of scale for factorial moments
and cumulants expected in the SDSS. It illustrates how these different
estimators perform and shows that the relative error on the cumulants $\bar\xi$, $S_3$ and $S_4$
is expected to be smaller than 3, 5 and 15 percent, respectively, in the scale range
1-10 $h^{-1}$ Mpc [630].

6.8 Multivariate Count-in-Cells 

The generalization of count-in-cells to the multivariate case is quite
straightforward. Here we focus on bivariate statistics, which were used to compute
the cosmic error on count-in-cells estimators in Sect. 6.7.4.

For a pair of cells at positions $\mathbf{r}_1$ and $\mathbf{r}_2$ separated by distance $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$,
factorial moment correlators [620] are defined as

$$ W_{kl}(r_{12}) \equiv \frac{F_{kl} - F_k F_l}{\bar N^{k+l}}, \tag{459} $$

$$ W_{k0} \equiv \frac{F_{k0}}{\bar N^k} = \frac{F_k}{\bar N^k}, \tag{460} $$

where the joint factorial moment is given by

$$ F_{kl}(r_{12}) \equiv \langle (N_1)_k\, (N_2)_l \rangle, \tag{461} $$

with $N_1$ and $N_2$ the counts in the two cells. Like factorial moments, $F_{kl}$
estimates joint moments of the smoothed density field:

$$ F_{kl}(r_{12}) = \bar N^{k+l}\, \langle [1 + \delta(\mathbf{r}_1)]^k\, [1 + \delta(\mathbf{r}_2)]^l \rangle. \tag{462} $$

The joint factorial moments and thus the factorial moment correlators can
be easily related to the quantities of physical interest, namely the two-point
density normalized cumulants, also known as cumulant correlators [623],
$C_{pq}$ [Eq. (348)]. Indeed, as for the monovariate case, one can write

$$ F_{kl} = \left( \frac{\partial}{\partial x} \right)^k \left( \frac{\partial}{\partial y} \right)^l \mathcal{P}(x+1, y+1) \Big|_{x=y=0}, \tag{463} $$

$$ \mathcal{M}(\bar N x, \bar N y) = \exp[\mathcal{C}(x,y)] = \mathcal{P}(x+1, y+1), \tag{464} $$

where $\mathcal{P}(x,y)$ is the generating function for bicounts defined previously in
Eq. (450), $\mathcal{M}(x,y) = \langle \exp[x\, \delta(\mathbf{r}_1) + y\, \delta(\mathbf{r}_2)] \rangle$ is the moment generating function
(Sect. 3.3.3) and $\mathcal{C}(x,y)$ is the two-point density cumulant generating function
[Eq. (138)]. For example, the first few cumulant correlators are [623]

$$ C_{12}\, \bar\xi\, \xi = W_{12} - 2 \xi, \tag{465} $$

$$ C_{13}\, \bar\xi^2 \xi = W_{13} - 3 W_{12} - 3 W_{20}\, \xi + 6 \xi, \tag{466} $$

$$ C_{22}\, \bar\xi^2 \xi = W_{22} - 4 W_{12} + 4 \xi - 2 \xi^2, \tag{467} $$

with $\xi \equiv \xi(r_{12})$. We have used the approximation $W_{11} \simeq \xi$, valid when
$r_{12} \gg R$.

An unbiased estimator for the joint factorial moment $F_{kl}$ analogous to Eq. (435)
is simply, for a set of $P$ pairs of cells in the catalog separated by distance $r$
and thrown at random (with random direction),

$$ \hat F_{kl}(r) = \frac{1}{2P} \sum_{\mathrm{pairs}\,(i,j)} \left[ (N_i)_k (N_j)_l + (N_i)_l (N_j)_k \right]. \tag{468} $$

A possible (biased) estimator for the factorial moment correlators is then, for
the same set of cells,

$$ \hat W_{kl}(r) = \frac{\hat F_{kl} - \hat F_{k0}\, \hat F_{l0}}{\hat F_{10}^{\,k+l}}, \tag{469} $$

with the definition

$$ \hat F_{k0} = \frac{1}{2P} \sum_{\mathrm{pairs}\,(i,j)} \left[ (N_i)_k + (N_j)_k \right]. \tag{470} $$

At this point, it is interesting to notice again that $\hat W_{11}$ can be used directly as
an estimator of the two-point correlation function, if the cell size $R$ is small
compared to the separation $r$ (e.g. [503,275]). In that case, the averages are
done on sets of pairs of cells in a bin $\Theta$ as defined in Sect. 6.4.1.

Further generalization to higher-order multivariate statistics is trivial. For
example, $W_{111}$ can be used to estimate the three-point correlation function
(e.g. [275]), $W_{1111}$ to estimate the four-point correlation function (e.g. [226])
and so on.
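
The pair estimators of this section can be sketched in a few lines of Python.
The code below assumes the symmetrized form of Eq. (468) and a correlator
estimator built as $(\hat F_{kl} - \hat F_{k0} \hat F_{l0}) / \hat F_{10}^{k+l}$; the function names and
the toy counts are hypothetical.

```python
from math import prod

def falling(n, k):
    """(N)_k = N (N-1) ... (N-k+1), with (N)_0 = 1."""
    return prod(n - j for j in range(k))

def joint_factorial_moment(pairs, k, l):
    """Eq. (468): symmetrized average of (N_i)_k (N_j)_l over pairs of cells.
    With l = 0 this reduces to the single-cell average of Eq. (470)."""
    s = sum(falling(ni, k) * falling(nj, l) + falling(ni, l) * falling(nj, k)
            for ni, nj in pairs)
    return s / (2 * len(pairs))

def moment_correlator(pairs, k, l):
    """Biased estimator of W_kl: (F_kl - F_k0 F_l0) / F_10^(k+l)."""
    f = lambda a, b: joint_factorial_moment(pairs, a, b)
    return (f(k, l) - f(k, 0) * f(l, 0)) / f(1, 0)**(k + l)

# Hypothetical counts (N_i, N_j) in three pairs of cells at fixed separation.
pairs = [(1, 3), (2, 2), (3, 1)]
w11 = moment_correlator(pairs, 1, 1)
```

When the cell size is small compared to the separation, `w11` plays the role of
a two-point correlation function estimate for that separation bin.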

6.9 Optimal Weighting

To optimize the measurements of $N$-point statistics, the data can be given a
varying spatial weight $\omega(\mathbf{r}_1, \ldots, \mathbf{r}_N)$ symmetric in its arguments and properly
normalized. Furthermore, in realistic redshift surveys, the average number
density of galaxies changes with distance $r$ from the observer:

$$ \bar n_g(r) = \bar n_g\, \phi(r), \tag{471} $$

where $\phi(r) \le 1$ is the selection function. Now, the estimators defined so far
are valid only for statistically homogeneous catalogs, i.e. with constant $\bar n_g(r)$.
One way to avoid this problem is to use volume limited catalogs. This method
consists in extracting from the parent catalog subsamples of depth $R_i$ such
that objects in these subsamples would still be brighter than the apparent
magnitude limit if placed at distance $r = R_i$ from the observer. Such a selection
criterion renders the number density of galaxies in the subsamples independent
of distance at the price of a significant information loss (footnote 76). In order to be able
to extract all the information from the catalog, it is however possible to correct
the estimators for the spatial variation of $\bar n_g(r)$. Moreover, the signal to noise
can be further improved by appropriate choice of the weight function $\omega$.


76 However, a number of volume-limited samples can be constructed from the parent
catalog to compensate for this.

The generalization of Eq. (422) reads

$$ D^p R^q = \sum \frac{\omega(\mathbf{x}_1, \ldots, \mathbf{x}_p, \mathbf{y}_1, \ldots, \mathbf{y}_q)}{\phi(\mathbf{x}_1) \cdots \phi(\mathbf{x}_p)\, \phi(\mathbf{y}_1) \cdots \phi(\mathbf{y}_q)}\, \Theta(\mathbf{x}_1, \ldots, \mathbf{x}_p, \mathbf{y}_1, \ldots, \mathbf{y}_q). \tag{472} $$

(We assume that the same selection effects are applied to the random catalog $R$.)
Note that the weight could be included in the bin function $\Theta$, but we prefer to
separate the idea of spatial weighting from the idea of binning. In principle, the
binning can slightly change the nature of the measured statistic, turning $A$ into $A_\Theta \neq A$.
Of course, up to now we have assumed that the binned quantity is always
close to the quantity of interest, $A_\Theta \simeq A$, but this condition is not absolutely
necessary: the binning function $\Theta$ can be chosen arbitrarily and determined a
priori. Then the statistic of interest becomes $A_\Theta$ instead of the original $A$. For
example, count-in-cells represent a particular choice of the binning function.
On the other hand, the spatial weight should not bring any change, i.e. the
weighted quantity should be, after ensemble average, equal to the real value
(or at least, very close to it): $\langle \hat A_{\Theta,\omega} \rangle = A_\Theta$.

The optimal weight by definition minimizes the cosmic error. In what follows,
we assume that $\bar n_g$ and $\phi(r)$ are externally determined with very good
accuracy. As a result the cosmic error for $N$-point statistics is given by Eq. (425),
with the obvious corrections for the weights and the selection
function. The optimal weight can then be found by solving an integral
equation for the function $\omega$ [291,293,152]. There are several methods to solve this
equation, for example by pixelizing the data, thus transforming the integral
into a sum. In this way, solving the integral equation corresponds to inverting
a matrix. We shall come back to that at the end of this section and in Sect. 6.11.2.
Otherwise, it has been shown that within the following approximations,

(1) the considered $N$-tuplets occupy a region $\mathcal{R}$ small enough compared to
the size of the catalog that variations of the function $\phi$ in the vicinity of an
$N$-tuplet are negligible, $\phi(\mathbf{r}_1) \simeq \ldots \simeq \phi(\mathbf{r}_N)$;

(2) edge effects are insignificant;

(3) the function $\omega$ depends only on the position $\mathbf{r}$ of the region $\mathcal{R}$, i.e. the
variations of $\omega$ within $\mathcal{R}$ are negligible;

the function $\omega(\mathbf{r})$ that gives the optimal weight for the two-point function
(and this is likely to be the case for the higher-order functions as well) appears
to be a functional of the selection function only [291].

Within this simplifying framework (footnote 77), the solution for the optimal weight is
very simple [291,152]:

$$ \omega(\mathbf{r}) \propto 1/\sigma^2(\mathbf{r}), \tag{473} $$

where $\sigma(\mathbf{r})$ is the relative cosmic error on the considered statistic in a
statistically homogeneous catalog with the same geometry and the same underlying
statistics as the studied one, but with a number of objects such that its number
density is $\bar n_g\, \phi(r)$. This result actually applies as well to Fourier space (at least
for the power spectrum [212]) and to counts-in-cells statistics [152].

To find the optimal weight, one has to make assumptions about the higher-order
statistics in order to compute the cosmic error, since the latter depends
on statistics up to order $2k$ for estimators of $k$-th order statistics. To simplify the
calculation of $\sigma(\mathbf{r})$, the Gaussian limit is often assumed. This is valid only in
the weakly nonlinear regime and leads to the following weight for the two-point
correlation function, commonly used in the literature [410,291,462,196,293]:

$$ \omega(r) \propto \frac{1}{\left[ 1/\bar n_g(r) + J(r) \right]^2}, \tag{474} $$

where

$$ J(r) = \int_{r' \le r} d^D r'\, \xi(r'). \tag{475} $$

In Fourier space the result is [212]

$$ \omega(r) \propto \frac{1}{\left[ 1/(V \bar n_g(r)) + P(k) \right]^2}, \tag{476} $$

a result that can be easily guessed from Eq. (415). This equation is valid for
$\{k, \Delta k\} \gg 1/L$, where $L$ is the size of the catalog in the smallest direction and
$\Delta k$ is the width of the considered bin.
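
For a power-law correlation function the integral in Eq. (475) is analytic, which
makes the Gaussian-limit weight of Eq. (474) easy to tabulate. The sketch below
assumes $D = 3$, $\xi(r) = (r/r_0)^{-\gamma}$ with $\gamma < 3$, and reads the density as the
local mean density at the position of the pair and `separation` as the pair
separation; these choices and all names are illustrative assumptions.

```python
from math import pi

def j3_power_law(separation, r0=5.0, gamma=1.8):
    """Eq. (475) in 3-D for xi(r) = (r/r0)^(-gamma), gamma < 3:
    J(r) = 4 pi r0^gamma r^(3-gamma) / (3-gamma)."""
    return 4.0 * pi * r0**gamma * separation**(3.0 - gamma) / (3.0 - gamma)

def pair_weight(n_g_local, separation, **xi_params):
    """Eq. (474), up to normalization: 1 / [1/n_g + J(r)]^2."""
    return 1.0 / (1.0 / n_g_local + j3_power_law(separation, **xi_params))**2

# Hypothetical local densities (in h^3 Mpc^-3) at separation 10 h^-1 Mpc.
dense = pair_weight(1e-1, 10.0)
sparse = pair_weight(1e-4, 10.0)
```

In the dense limit the weight saturates at $1/J^2$, independent of the selection
function, while sparsely sampled regions are down-weighted.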

Note that the function $\omega(r)$ is of pairwise nature. It corresponds to weighting
the data with

$$ \bar n_g(r) \to \bar n_g(r)\, \sqrt{\omega(r)}. \tag{477} $$


Now, we turn to a more detailed discussion of optimal weighting in count-in-cell
statistics. The problem of finding the optimal sampling weight was studied
in [152]. Similarly to Eq. (472), the weighted factorial moment estimator reads

$$ \hat F_k^C = \frac{1}{C} \sum_{i=1}^{C} \frac{(N_i)_k\, \omega(\mathbf{r}_i)}{[\phi_R(\mathbf{r}_i)]^k}, \tag{478} $$

where $\phi_R(\mathbf{r})$ is the average of the selection function over a cell.

77 Hamilton [293,294] developed a general formalism for optimizing the
measurement of the two-point correlation function in real and Fourier space, relying on
the covariance matrix of the statistic $\langle \delta(\mathbf{r}_i)\, \delta(\mathbf{r}_j) \rangle$, which would correspond to the
binning function $\Theta(\mathbf{r}_1, \mathbf{r}_2) = \delta_D(\mathbf{r}_1)\, \delta_D(\mathbf{r}_2)$. He proposed a way of computing the
optimal sampling weight without requiring these simplifying assumptions.

To simplify the writing of the cosmic error as a function of the sampling weight,
the variations of the function $\omega$ and of the selection function are assumed to
be negligible within the cells, which is equivalent to points (1) and (3) above.
Then the relative cosmic error $\sigma^2_{F_k}[\omega, \phi] = (\Delta F_k / F_k)^2$ is

$$ \sigma^2_{F_k}[\omega, \phi] = \sigma_F^2[\omega] + \sigma_E^2[\omega] + \sigma_D^2[\omega, \phi], \tag{479} $$

where the finite volume, edge effect and discreteness contributions read,
respectively,

$$ \sigma_F^2[\omega] = \frac{\sigma_F^2}{\bar\xi(L)\, V^2} \int_V d^3 r_1\, d^3 r_2\, \omega(\mathbf{r}_1)\, \omega(\mathbf{r}_2)\, \xi(r_{12}), \tag{480} $$

$$ \sigma_E^2[\omega] = \frac{\sigma_E^2}{V} \int_V d^3 r\, \omega^2(\mathbf{r}), \tag{481} $$

$$ \sigma_D^2[\omega, \phi] = \frac{1}{V} \int_V d^3 r\, \omega^2(\mathbf{r})\, \sigma_D^2(\mathbf{r}). \tag{482} $$

In these equations, there are terms such as $\sigma_F^2 = \sigma_F^2[1]$ or $\sigma_E^2 = \sigma_E^2[1]$. They
correspond to the finite volume and edge effect errors in the case of a homogeneous
sampling weight. They do not depend on the number density and are given
by analytical expressions in Appendix F. The term $\sigma_D^2(\mathbf{r})$ is similar, but there
is a supplementary $\mathbf{r}$ dependence because the average count $\bar N$ is proportional
to the selection function $\phi$.

Using Lagrange multipliers, it is easy to write the following integral equation
which determines the optimal weight [152]:

$$ \frac{\sigma_F^2}{\bar\xi(L)\, V} \int_V d^3 u\, \omega(\mathbf{u})\, \xi(|\mathbf{r} - \mathbf{u}|) + \left[ \sigma_E^2 + \sigma_D^2(\mathbf{r}) \right] \omega(\mathbf{r}) + \lambda = 0. \tag{483} $$

The constant $\lambda$ is determined by appropriate normalization of the weight
function:

$$ \frac{1}{V} \int_V d^3 r\, \omega(\mathbf{r}) = 1. \tag{484} $$

The solution of this integral equation can be found numerically. However, the
approximation (473) was found to be excellent, i.e. it almost perfectly minimizes
the cosmic error [152].

Using the leading order theory of propagation of errors in Sect. 6.7.3, it is easy
to see that these calculations apply as well to the variance and the cumulants,
provided that errors are small enough: in Eqs. (436), (437) and (438), $F_k$ would
be computed with Eq. (478), using the sampling weight minimizing the cosmic
error of the cumulant of interest.
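
As mentioned earlier, pixelizing the survey turns the integral equation (483)
into a linear system. The sketch below assumes the survey is split into $n$
equal-volume pixels, so that $(1/V)\int d^3u$ becomes an average over pixels, and
solves the system by plain Gaussian elimination. The pixelization, the toy
correlation matrix and all names are assumptions made for illustration.

```python
from math import exp

def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (M[r][n] - sum(M[r][k] * y[k] for k in range(r + 1, n))) / M[r][r]
    return y

def optimal_weight(xi_matrix, sig2_F, sig2_E, sig2_D):
    """Pixelized Eq. (483): sum_j A_ij w_j = -lambda, normalized as in Eq. (484)."""
    n = len(sig2_D)
    xi_L = sum(map(sum, xi_matrix)) / n**2           # pixelized Eq. (452)
    A = [[sig2_F * xi_matrix[i][j] / (xi_L * n) + (sig2_E + sig2_D[i]) * (i == j)
          for j in range(n)] for i in range(n)]
    y = solve(A, [1.0] * n)                          # w is proportional to A^-1 1
    mean = sum(y) / n
    return [v / mean for v in y]

# Toy 1-D survey: 20 pixels, exponential correlations, and a discreteness
# error that grows with distance from the observer (hypothetical numbers).
n = 20
xi_matrix = [[exp(-abs(i - j) / 3.0) for j in range(n)] for i in range(n)]
sig2_D = [0.01 * (1 + i)**2 for i in range(n)]
w = optimal_weight(xi_matrix, 0.02, 0.005, sig2_D)
```

When the finite-volume coupling is switched off (`sig2_F = 0`) the solution
reduces to a weight inversely proportional to the local squared error, which is
the content of the approximation (473).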

This result shows as well that for a statistically homogeneous catalog, a unit
weight $\omega = 1$ is very close to optimal in most practical cases for count-in-cell
statistics. This statement of course is not necessarily true for $N$-point
correlation functions, particularly if the catalog presents a complicated geometry.
In that case, the use of a weight might help to correct for edge effects at
large scales, although the LS estimator and its generalization to higher order
already perform well in this respect with a uniform weight. For traditional
counts-in-cells estimators, the finite extension of the cells prevents
correcting for edge effects. This is actually the main weakness of these statistics
compared to the $N$-point correlation functions, and often the latter are
preferred to the former, particularly when the geometry of the catalog is
complicated by the presence of numerous masks which reduce considerably the range
of scales available to counts-in-cells.

Finally it is worth noting the following point: the optimal weight is actually
difficult to compute, because it requires knowledge of statistics of order $l \le 2k$
for an estimator of order $k$. Therefore, the Gaussian limit, given by Eqs. (474)
and (476) for the functions $\xi(r)$ and $P(k)$ respectively, was widely used in the
literature. However, this is rigorously valid only in the weakly nonlinear regime,
where the shot noise error is likely to be negligible, implying that a simple, uniform
weight is nearly optimal unless the catalog is very diluted. Discreteness
errors are less of a concern with modern surveys under construction, such as
the 2dFGRS or the SDSS.

Furthermore, it was noticed in [152] that the traditional volume limited sample
method does almost as well as a single optimized measurement extracting all
the information from the catalog, if the depth of the subsample, $R_i$, is chosen
such that for the scale considered the signal to noise is approximately maximal.
Of course, estimating the cosmic error is still a problem, but the advantage
of the volume limited approach is that prior determination of the selection
function is not necessary, which simplifies considerably the analysis.

6.10 Cosmic Distribution Function and Cross-Correlations 

6.10.1 Cosmic Distribution Function and Likelihood 

For a set of (possibly biased) estimators, $\hat{\mathbf{f}} = \{\hat f_k\}_{k=1,K}$, let us define the
covariance matrix as $C_{kl} = \mathrm{Cov}\{\hat f_k, \hat f_l\}$. The off-diagonal terms can be
correlations between a given estimator (e.g. of the power spectrum) at different
scales (as in Sect. 6.5.4), between different estimators at the same scale (e.g.
factorial moments, see Sect. 6.10.2 below), or in general different estimators at
different scales. Knowledge of these cross-correlations can in fact help to better
constrain theories with observations, because they bring more information on
the shape of the cosmic distribution function.

As mentioned in Sect. 6.2, the cosmic distribution function $\Upsilon$ is the probability
distribution for an estimator given a theory (or class of theories parametrized
in some convenient form), i.e. $\Upsilon = \Upsilon(\hat{\mathbf{f}}\,|\,\mathrm{theory})$ is the probability of measuring
$\hat{\mathbf{f}}$ in a finite galaxy catalog given a theory. Knowledge of $\Upsilon(\hat{\mathbf{f}}\,|\,\mathrm{theory})$ allows
one to extract constraints on cosmological parameters from the data through
maximum likelihood analysis, where the likelihood function is given by the
cosmic distribution function regarded as a function of the parameters that
characterize the theory (with $\hat{\mathbf{f}}$ replaced by the observed data).

In particular, if the cosmic distribution function $\Upsilon$ is Gaussian, it is entirely
determined once the covariance matrix $\mathbf{C}$ is known:

$$ \Upsilon(\hat{\mathbf{f}}\,|\,\mathbf{C}, \mathbf{f}, \mathbf{b}) = \frac{1}{\sqrt{(2\pi)^K |\mathbf{C}|}} \exp\left[ -\frac{1}{2} \sum_{k,l} \delta f_k\, C^{-1}_{kl}\, \delta f_l \right], \tag{485} $$

where $\mathbf{C}^{-1}$ and $|\mathbf{C}|$ are respectively the inverse and the determinant of the
covariance matrix, $\mathbf{f}$ is the true value of the statistics in question ($\mathbf{f} = \langle \hat{\mathbf{f}} \rangle$ for
unbiased estimators) and $\mathbf{b}$ is a vector accounting for possible cosmic bias. Both
$\mathbf{C}$ and $\mathbf{f}$ (and $\mathbf{b}$ if non-zero) are calculated from theoretical predictions as a
function of cosmological parameters.
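
A Gaussian cosmic distribution function makes the likelihood straightforward to
evaluate. The sketch below writes out the logarithm of Eq. (485) for $K = 2$
estimators, taking $\delta f_k$ to be the difference between the measured value and
the model prediction (with any cosmic bias already folded into the prediction);
that reading of $\delta f_k$, the function name and the numbers are assumptions.

```python
from math import log, pi

def gaussian_log_likelihood(f_hat, f_model, cov):
    """Logarithm of Eq. (485) for K = 2 estimators, with
    delta_f = f_hat - f_model and cov the 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    df = [f_hat[0] - f_model[0], f_hat[1] - f_model[1]]
    chi2 = sum(df[k] * inv[k][l] * df[l] for k in range(2) for l in range(2))
    return -0.5 * chi2 - 0.5 * log((2.0 * pi)**2 * det)

# Hypothetical measurement of two statistics and their predicted covariance.
cov = [[0.04, 0.01], [0.01, 0.09]]
logL = gaussian_log_likelihood([1.1, 2.9], [1.0, 3.0], cov)
```

Maximizing this quantity over the parameters that determine `f_model` and `cov`
is the maximum likelihood analysis described in the text.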

It is very important to note that the Gaussian assumption for $\Upsilon$ is in general
different from assuming that the density field is Gaussian, unless the estimator
$\hat{\mathbf{f}}$ corresponds to the density contrast (footnote 78). For this reason, Eq. (485) is not
necessarily a good approximation for estimators that are not linear in the
density contrast, even if the underlying statistic of the density field is Gaussian.
We shall come back to this point in Sect. 6.10.3.

Why is it useful to take as f non-linear functions of the density contrast? The 
problem is that the assumption of Gaussianity for the density held itself is 
very restrictive to deal with galaxy clustering: it does not include information 
on higher-order moments which arise due to e.g. non-linear evolution, non¬ 
linear galaxy bias, or primordial non-Gaussianity. Since there is no general 
expression for the multi-point PDF of the density held which describes its 
non-Gaussian shapef 79 ], one must resort to a different approach. The key idea 
is that taking f to be a statistic^ 17 ] of the density held, it is possible to work in 
a totally different regime. Indeed, when the cosmic error is sufficiently small, 
there must be many independent contributions to f so that, by the central limit 
theorem, its cosmic distribution function should approach Gaussianityp]. On 
the other hand, the cosmic error becomes large when probing large-scales, 

78 In this case T is proportional to the density PDF. 

,9 The Edgeworth expansion, Eq. (144), in principle provides a way to accomplish 
this [5]. In practice, however, its regime of validity is very restricted. 

80 These are non-linear functions of the data, e.g. the power spectrum is quadratic. 

81 Note that, in contrast to the PDF of the density held, this limit is usually ap- 
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where there are not many independent samples; in this case, assumption of a 
Gaussian density held plus the nonlinear transformation involved in f leads 
to a useful guess about the asymptotic behavior of T. In practice, the specific 
shape of T must be computed for a given set of theories, and the limit of 
validity of the asymptotic forms discussed above should be carefully checked, 
as discussed further in Sect. 6.10.3. 

This remainder of this section is organized as follows. In Sect. 6.10.2 we dis¬ 
cuss about correlations between different statistics. As an example, we show 
how knowledge of the number of objects in a galaxy catalog can be used to 
reduce the error bar on the measurement of the two-point correlation func¬ 
tion. Then, in Sect. 6.10.3, we address the problem of non-Gaussianity of the 
cosmic distribution function. 


6.10.2 Cross-Correlations Between Different Statistics 

An important kind of cross-correlation is given by that between statistics of 
different kind. For example, the calculation leading to Eq. (400) is a conditional 
average with the constraint that the average number density is equal to the 
observed one: 

(A^|n g ) 2 = (f |n g ) - (f|ra g > 2 

/£ 2 r(e,h g )d£ 

/mn g )d£ 

The knowledge of this supplementary information decreases the expected error 
on the measurement of £(r) and provides better constraints on the models. 
The calculation of Bernstein leading to Eq. (395) does not make use of the 
fact that h g can be measured separately: 


m,n g )d£ 


(486) 

(487) 


(a|) 2 =<I 2 }-<o 2 

= / £ 2 T(£,n g )d£dn g -* J fT(f,n g )dfdn g 


(488) 

(489) 


and therefore slightly overestimates the error on £(r) as emphasized in [393]. 
For example, if the function T is Gaussian, we have 


(A£|n g ) 2 = (A|) 




(490) 


where the correlation coefficient pab is defined for estimators A and B as 


Pab 


(( A-(A))(B-(B ))) 
aAab 


( 491 ) 


proached at small scales, we shall discuss examples below in Sect. 6.10.3. 
From this simple result, we see that joint measurement of (theoretically) more
correlated or anti-correlated statistics brings better constraints on the
underlying theory.
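
The reduction of the error by the factor $1-\rho^2$ can be checked numerically. The following is a minimal sketch, assuming a toy bivariate Gaussian cosmic distribution for two estimators with unit variances and a correlation coefficient of 0.6; these numbers and the function names are illustrative assumptions, not values from any survey or code from the cited works.

```python
import numpy as np

def correlation_coefficient(a, b):
    """Sample estimate of rho_AB = <dA dB> / (Delta A * Delta B), cf. Eq. (491)."""
    da, db = a - a.mean(), b - b.mean()
    return np.mean(da * db) / np.sqrt(np.mean(da**2) * np.mean(db**2))

def conditional_variance(var_a, rho_ab):
    """Variance of A at fixed B for a bivariate Gaussian, cf. Eq. (490)."""
    return var_a * (1.0 - rho_ab**2)

# Toy joint distribution of (xi_hat, n_hat); all values here are assumed.
rng = np.random.default_rng(0)
rho_true = 0.6
cov = np.array([[1.0, rho_true], [rho_true, 1.0]])
xi_hat, n_hat = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T

rho = correlation_coefficient(xi_hat, n_hat)
predicted = conditional_variance(xi_hat.var(), rho)

# Direct check: scatter of xi_hat among realizations with n_hat in a narrow bin.
in_bin = np.abs(n_hat - 0.5) < 0.05
measured = xi_hat[in_bin].var()
print(f"rho = {rho:.3f}, predicted = {predicted:.3f}, measured = {measured:.3f}")
```

The narrow bin in `n_hat` plays the role of the constraint that the number density equals the observed one; for a non-Gaussian $\Upsilon$ the simple factor $1-\rho^2$ would no longer be expected to hold.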

In [630] and as described in Sect. 6.7.4, cross-correlations between factorial
moments are computed analytically at fixed scale. From the theory of
propagation of errors, it is straightforward to compute cross-correlations between
other count-in-cells statistics of physical interest, such as the average count $\bar N$,
the variance $\bar\xi$ and the cumulants $S_p$. Theoretical calculations and measurements in
numerical simulations [630,153] show that, for realistic galaxy catalogs such
as the SDSS, $\bar N$ and $\bar\xi$ are not, in general, strongly correlated, and similarly
for correlations between $\bar N$ and higher-order statistics. Interestingly, $\bar\xi$ and $S_3$
are not very strongly correlated, but $S_3$ and $S_4$ are. In general, and
as expected, the degree of correlation between two statistics of orders $k$ and $l$
decreases with $|k-l|$.

6.10.3 Validity of the Gaussian Approximation 

We now discuss the validity of the Gaussian approximation, Eq. (485), for the
cosmic distribution function. To illustrate the point, we take two examples,
the first about count-in-cells statistics, the second about the power
spectrum and bispectrum.

Exhaustive measurements in one of the Hubble volume simulations [631] show
that for count-in-cells statistics, $\Upsilon(\hat A)$ is approximately Gaussian if
$\Delta A/A \lesssim 0.2$. Therefore, at least for count-in-cells, Gaussianity is warranted only if
the errors are small enough. When the cosmic errors become significant, the
cosmic distribution function becomes increasingly skewed, developing a tail at
large values of $\hat A$ [631]. This result applies to most counts-in-cells estimators
($P_N$, $F_k$, $\bar\xi$, $S_p$). One consequence is that the most likely value is below the
average, resulting in an effective cosmic bias, even for unbiased statistics such
as factorial moments: typically, the measurement of a statistic $\hat A$ in a finite
catalog is likely to underestimate the real value, except in some rare cases
where it will overestimate it by a larger amount (see footnote 82). To take into account the
asymmetry in the shape, it was proposed in [631] to use a generalized version
of the lognormal distribution, which describes very well the shape of the function
$\Upsilon(\hat A)$ for a single statistic, as illustrated by Fig. 41:

$$ \Upsilon(\hat A) = \frac{s}{\Delta A\,[s(\hat A - A)/\Delta A + 1]\sqrt{2\pi\eta}}\; \exp\left(-\frac{\{\ln[s(\hat A - A)/\Delta A + 1] + \eta/2\}^2}{2\eta}\right), \tag{492} $$

$$ \eta = \ln(1 + s^2), \tag{493} $$

where $s$ is an adjustable parameter. It is fixed by the requirement that the
analytical function Eq. (492) have identical average, variance, and skewness,
$S_3 = 3 + s^2$, as the measured $\Upsilon(\hat A)$.

82 This is of course analogous to non-Gaussianity in the density PDF. Positive
skewness means that the most likely value underestimates the mean, see Eq. (230).
To compensate for this there is a rare tail at large values compared to the mean,
see e.g. Fig. 20.
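
As a sanity check of Eqs. (492) and (493), the moments of the generalized lognormal can be integrated numerically. The sketch below assumes illustrative values for the average, the cosmic error and $s$ (none of them from the text), and it reads the statement $S_3 = 3 + s^2$ as applying to the rescaled variable $x = 1 + s(\hat A - A)/\Delta A$; that reading is an assumption on my part.

```python
import numpy as np

def generalized_lognormal_pdf(a_hat, mean, err, s):
    """Eq. (492): PDF of a_hat with average `mean`, cosmic error `err`, shape `s`."""
    eta = np.log1p(s**2)                              # Eq. (493)
    x = s * (np.asarray(a_hat, dtype=float) - mean) / err + 1.0
    safe = np.where(x > 0.0, x, 1.0)                  # avoid log of non-positive x
    pdf = s / (err * safe * np.sqrt(2.0 * np.pi * eta)) * np.exp(
        -(np.log(safe) + 0.5 * eta) ** 2 / (2.0 * eta))
    return np.where(x > 0.0, pdf, 0.0)                # PDF vanishes where x <= 0

# Assumed toy parameters: true value 10, cosmic error 2, shape parameter 0.5.
mean, err, s = 10.0, 2.0, 0.5
a = np.linspace(mean - err / s, 80.0, 400_001)        # support starts where x = 0
da = a[1] - a[0]
p = generalized_lognormal_pdf(a, mean, err, s)

norm = np.sum(p) * da                                 # should be close to 1
avg = np.sum(a * p) * da                              # should be close to `mean`
var = np.sum((a - avg) ** 2 * p) * da                 # should be close to err**2

# Cumulant ratio of the rescaled variable x = 1 + s (a - mean) / err.
x = s * (a - mean) / err + 1.0
mu2_x = np.sum((x - 1.0) ** 2 * p) * da
mu3_x = np.sum((x - 1.0) ** 3 * p) * da
s3_x = mu3_x / mu2_x**2                               # compare with 3 + s**2
print(norm, avg, var, s3_x)
```

By construction the average and variance do not depend on $s$, so $s$ only controls the asymmetry of the distribution.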

However, the generalization of Eq. (492) to multivariate cosmic distribution
functions is not easy, although feasible at least in some restricted cases (see
e.g. [585]). An alternative approach would employ a multivariate Edgeworth
expansion [5].

Since the Gaussianity of the cosmic distribution function mainly depends on
the variance of the statistic under consideration, it is expected that for
surveys where errors are not negligible, Gaussianity is not a good approximation.
Figure 42 illustrates this for IRAS surveys in the case of the power
spectrum and bispectrum [565], as a function of the normalized variable
$\delta\hat A/\Delta A \equiv (\hat A - A)/\langle(\hat A - A)^2\rangle^{1/2}$.
For the bispectrum, this choice of variable makes the
cosmic distribution function approximately independent of scale and
configuration.

The left panel of Fig. 42 shows the power spectrum cosmic distribution
function as a function of scale. From least to most non-Gaussian, the scales are
$k/k_f = 1-10$, $k/k_f = 11-20$, $k/k_f = 21-30$ and $k/k_f = 31-40$, where
$k_f = 0.005$ h/Mpc. As expected, non-Gaussianity is significant at large scales, where there are
only a few independent modes (due to the finite volume of the survey), so that
the power spectrum PDF is chi-squared distributed. As smaller scales are
considered, averaging over more modes leads to a more Gaussian distribution,
although the convergence is slow since the contributing modes are strongly
correlated due to shot noise.
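
The approach to Gaussianity as more modes are averaged can be illustrated with a small Monte Carlo. This is a sketch under a simplifying assumption that differs from the survey case discussed above: the modes are taken to be independent complex Gaussian variables of unit power, so the correlations induced by the survey window and shot noise are ignored. In that idealized case the band power is a scaled chi-squared variable and its skewness is $2/\sqrt{n}$ for $n$ modes.

```python
import numpy as np

def band_power_samples(n_modes, n_real, rng):
    """Band-averaged |delta_k|^2 over n_modes independent Gaussian modes."""
    re = rng.normal(scale=np.sqrt(0.5), size=(n_real, n_modes))
    im = rng.normal(scale=np.sqrt(0.5), size=(n_real, n_modes))
    return np.mean(re**2 + im**2, axis=1)      # unit mean power per mode

def skewness(x):
    """Dimensionless skewness <d^3> / <d^2>^(3/2) of a sample."""
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

rng = np.random.default_rng(2)
results = {}
for n_modes in (4, 16, 64):
    p_hat = band_power_samples(n_modes, 40_000, rng)
    results[n_modes] = skewness(p_hat)
    print(n_modes, results[n_modes], 2.0 / np.sqrt(n_modes))
```

With correlated modes the effective number of independent modes is smaller than the nominal count, which is consistent with the slow convergence noted in the text.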

The right panel in Fig. 42 shows a similar plot for the bispectrum. In sparsely
sampled surveys such as QDOT, the deviation from Gaussianity can be very
significant. In a large volume-limited sample of 600 Mpc/h radius with many
galaxies (dotted curve), Gaussianity becomes an excellent approximation, as
expected. The cosmic distribution function for $\chi^2$ initial conditions was also
calculated in [565]; in this case non-Gaussianity is significant even for large
volume surveys, and thus must be taken into consideration in order to properly
constrain primordial non-Gaussianity [567,211].

6.11 Optimal Techniques for Gaussian Random Fields 

Up to now, we have restricted our discussion to a particular subset of
estimators commonly used in the literature, which apply equally well to two-point
and higher-order statistics. To give an account of recent developments, we now
reinvestigate the search for optimal estimators in the framework of Gaussian
random fields. That is, the cosmic distribution function, with estimators $\hat f$
that will be taken as density contrasts (measured in pixels or their equivalent
in some space of functions, such as spherical harmonics), will be assumed to
be Gaussian. As discussed above, this approach is only justifiable to obtain
estimates of the power spectrum (or two-point correlation function) at the
largest scales, where Gaussianity becomes a good approximation.

[Figure 41: grid of panels with horizontal axes $\delta\hat{\bar\xi}/\Delta\bar\xi$, $\delta\hat S_3/\Delta S_3$ and $\delta\hat S_4/\Delta S_4$.]

Fig. 41. The cosmic distribution function of measurements $\Upsilon(\hat{\bar\xi})$ (upper line of
panels), $\Upsilon(\hat S_3)$ (middle line of panels) and $\Upsilon(\hat S_4)$ (lower line of panels) measured from
a distribution of subsamples extracted from a Hubble volume simulation (see end
of Sect. 6.7.4 for more details). The scale of the measurements, either R = 1, 7.8
or 62.5 $h^{-1}$ Mpc, is indicated on each panel. The solid, dotted and dashed curves
correspond to the Gaussian, lognormal and generalized lognormal [Eq. (492)]
distributions, respectively. With the choice of the coordinate system, the magnitude of
the cosmic error does not appear directly, but is reflected indirectly by the amount
of skewness of the lognormal distribution.

First we recall basic mathematical results about minimum variance and
maximum likelihood estimators (Sect. 6.11.1). In Sect. 6.11.2, we discuss optimal
weighting for two-point statistics taking into account the full covariance
matrix (compare to Sect. 6.9), and in Sect. 6.11.3 we briefly address techniques
for obtaining uncorrelated estimates of the power spectrum, comparing with
results discussed in previous sections when relevant. Finally we briefly describe
the Karhunen-Loeve transform, useful for compressing the large amounts of data
expected in current and forthcoming surveys (Sect. 6.11.4).

Fig. 42. Left Panel: Power spectrum cosmic distribution function in an IRAS
1.2Jy-like survey as a function of scale, in logarithmic scale; the smooth solid line denotes
a Gaussian distribution. From least to most non-Gaussian, scales are $k/k_f = 1-10$,
$k/k_f = 11-20$, $k/k_f = 21-30$, $k/k_f = 31-40$, where $k_f = 0.005$ h/Mpc. Right
Panel: Cosmic distribution function of $\delta\hat Q/\Delta Q \equiv (\hat Q - Q)/\Delta Q$ for different
surveys in models with Gaussian initial conditions: 2nd order Lagrangian PT with
$256^3$ objects in a volume of 600 Mpc/h radius (dotted), IRAS 1.2Jy (solid), IRAS
2Jy (dashed), IRAS QDOT (long-dashed). The smooth solid curve is a Gaussian
distribution.

6.11.1 Maximum Likelihood Estimates 

The basic results given here are well known in statistical theory [610,690]. For 
more details and applications to optimal measurements of the power spectrum 
in cosmological data sets see e.g. [646,647,80,293]. 

Let's assume that we have at our disposal some data $\mathbf{x}$, say, a vector of
dimension $N$, with cosmic distribution function $\Upsilon(\mathbf{x})$, which is Gaussian and
can be expressed explicitly as a function of $\mathbf{x}$ and a set of unknown
parameters $\mathbf{f}$, which we aim to estimate given our data. When thought of as a function
of the parameters $\mathbf{f}$, $\Upsilon(\mathbf{f})$ is usually known as the likelihood function (see footnote 83). The
corresponding estimators, $\hat{\mathbf{f}} = (\hat f_1, \ldots, \hat f_K)$, $K < N$, are sought in the space
of functions of the data $\mathbf{x}$. The problem of finding an optimal estimator $\hat{\mathbf{f}}$
can be formally approached in at least two ways, the first consisting in
minimizing the cosmic error on $\hat{\mathbf{f}}$, the second in maximizing the
likelihood.

83 Therefore, the assumption of a Gaussian density field means that $\Upsilon(\mathbf{x})$ as a function of
$\mathbf{x}$ is Gaussian, whereas in the limit that a large number of uncorrelated data contribute,
$\Upsilon(\mathbf{f})$ becomes Gaussian.

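We restrict ourselves to unbiased estimators,

$$ \langle\hat{\mathbf{f}}\rangle = \int d^N x\; \Upsilon(\mathbf{x}|\mathbf{f})\, \hat{\mathbf{f}}(\mathbf{x}) = \mathbf{f}. \tag{494} $$

The search for the first kind of optimal estimator, already discussed in Sect. 6.9,
consists in minimizing the cosmic error

$$ \sigma^2_{\hat f_k} = \langle(\hat f_k - f_k)^2\rangle, \tag{495} $$
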


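given the constraint (494). It is useful at this point to assume that the
likelihood function is sufficiently smooth and to introduce the so-called Fisher
information matrix

$$ F_{kl} \equiv \left\langle \frac{\partial^2[-\log\Upsilon(\mathbf{f})]}{\partial f_k\,\partial f_l} \right\rangle = \left\langle \frac{\partial\log\Upsilon(\mathbf{f})}{\partial f_k}\, \frac{\partial\log\Upsilon(\mathbf{f})}{\partial f_l} \right\rangle. \tag{496} $$
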


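Let's assume that the matrix $\mathbf{F}$ and the covariance matrix $\mathbf{C}$, defined by
$C_{kl} \equiv {\rm Cov}(\hat f_k, \hat f_l) = \langle\delta\hat f_k\,\delta\hat f_l\rangle$, are positive definite. From the Cauchy-Schwarz
inequality one gets the so-called Cramer-Rao inequality

$$ (\Delta\hat f_k)^2\, F_{kk} \geq 1, \tag{497} $$

so that the inverse of the Fisher matrix can be thought of as the minimum error
that one can achieve. Through a change of variable this inequality can be
generalized to

$$ (\mathbf{a}^t\cdot\mathbf{C}\cdot\mathbf{a})\,(\mathbf{b}^t\cdot\mathbf{F}\cdot\mathbf{b}) \geq (\mathbf{a}^t\cdot\mathbf{b})^2, \tag{498} $$

where $\mathbf{a}$ and $\mathbf{b}$ are two sets of constants. It implies

$$ |\mathbf{C}| \geq \frac{1}{|\mathbf{F}|}. \tag{499} $$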
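
A one-parameter toy case makes the Cramer-Rao inequality (497) concrete. The sketch below assumes $N$ independent zero-mean Gaussian data with unknown variance $f$ (a stand-in for a single band power); the setup, the numbers and the function names are illustrative assumptions. For this model the Fisher information is $F = N/(2f^2)$, and the estimator $\hat f = \sum_i x_i^2/N$ is unbiased with variance $2f^2/N$, so the bound is saturated.

```python
import numpy as np

def fisher_information(f, n_data):
    """Fisher information on the variance f of n_data iid zero-mean Gaussians."""
    return n_data / (2.0 * f**2)

def variance_estimator(x):
    """Unbiased estimator of f when the mean is known to be zero."""
    return np.mean(x**2, axis=-1)

def score(x, f):
    """Derivative of the log-likelihood with respect to f."""
    n_data = x.shape[-1]
    return -n_data / (2.0 * f) + np.sum(x**2, axis=-1) / (2.0 * f**2)

rng = np.random.default_rng(1)
f_true, n_data, n_real = 2.0, 50, 100_000
x = rng.normal(scale=np.sqrt(f_true), size=(n_real, n_data))

f_hat = variance_estimator(x)
F = fisher_information(f_true, n_data)
cramer_rao_product = f_hat.var() * F                 # Eq. (497): should be >= 1
fisher_from_score = np.mean(score(x, f_true) ** 2)   # second form of Eq. (496)
print(f_hat.mean(), cramer_rao_product, fisher_from_score, F)
```

Here the score is exactly proportional to $\hat f - f$, with a proportionality factor $N/(2f^2)$ that depends on the parameter but not on the data.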


An estimator $\hat{\mathbf{f}}$ which obeys the equality in Eqs. (498) or (499) is called
minimum variance bound (MVB). This can happen if and only if the estimator $\hat{\mathbf{f}}$
can be expressed as a linear function of the derivative of the log-likelihood function
with respect to the parameters:

$$ \left[\frac{\partial\log\Upsilon}{\partial\mathbf{f}}\right]^t\cdot\mathbf{b} = g(\mathbf{f})\,(\hat{\mathbf{f}} - \mathbf{f})^t\cdot\mathbf{a}, \tag{500} $$

where the constant of proportionality $g(\mathbf{f})$ might depend on the parameters
but not on the data $\mathbf{x}$. As a result, for an arbitrary choice of the parameters
$\mathbf{f}$, minimum variance unbiased estimators are not necessarily MVB. The
second way of seeking an optimal estimator consists in maximizing directly the
likelihood function in the space of parameters, $\mathbf{f} \to \hat{\mathbf{f}}$. The goal is to find $\hat{\mathbf{f}}_{\rm ML}$
such that

$$ \Upsilon(\mathbf{x})\big|_{\mathbf{f} = \hat{\mathbf{f}}_{\rm ML}(\mathbf{x})} \geq \Upsilon(\mathbf{x})\big|_{\mathbf{f}}, \tag{501} $$



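for any possible value of $\mathbf{f}$. A practical, sufficient but not necessary condition
is given by the solution of the two sets of equations

$$ \frac{\partial\log\Upsilon}{\partial\mathbf{f}} = 0, \tag{502} $$

$$ \frac{\partial^2\log\Upsilon}{\partial f_k\,\partial f_l} < 0. \tag{503} $$
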


The solution of Eq. (501), if it exists, does not necessarily lead to an unbiased
estimator nor to a minimum variance estimator. But if by chance the obtained
ML estimator is unbiased, then it minimizes the cosmic error. Moreover, if
there is an MVB unbiased estimator, it is given by the ML method. Note that
in the limit where a large number of uncorrelated data contribute, the cosmic
distribution function tends to a Gaussian and the ML estimator is
asymptotically unbiased and MVB. In that regime, the cosmic cross-correlation matrix
of the ML estimator is very well approximated by the inverse of the Fisher
information matrix,

$$ C_{kl} \equiv {\rm Cov}(\hat f_k, \hat f_l) = \langle\delta\hat f_k\,\delta\hat f_l\rangle \simeq (F^{-1})_{kl}. \tag{504} $$



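On the other hand, from the Gaussian assumption for $\Upsilon(\mathbf{x})$, it follows that
the ML estimator for the power spectrum [$P(k_\alpha) \equiv f_\alpha$] is the solution of

$$ \hat f_\alpha = \frac{1}{2}\, F^{-1}_{\alpha\beta}\, \frac{\partial C_{ij}}{\partial f_\beta}\, [C^{-1}]_{ik}\,[C^{-1}]_{jl}\, (\delta_k\delta_l - N_{kl}) \tag{505} $$

(where $\delta_k$ denotes the density contrast at $\mathbf{r}_k$) for which the estimate is equal
to the prior, $\hat{\mathbf{f}} = \mathbf{f}$. That is, in order to obtain the ML estimator, one starts
with some prior power spectrum $\mathbf{f}$, then finds the estimate $\hat{\mathbf{f}}$, puts this back
into the prior, and iterates until convergence. In Eq. (505), the Fisher matrix
is obtained from Eq. (496),

$$ F_{\alpha\beta} = \frac{1}{2}\, \frac{\partial C_{ij}}{\partial f_\alpha}\, [C^{-1}]_{ik}\,[C^{-1}]_{jl}\, \frac{\partial C_{kl}}{\partial f_\beta}, \tag{506} $$

and the covariance matrix $C_{ij} \equiv \langle\delta_i\delta_j\rangle = \xi_{ij} + N_{ij}$ contains a term due to clustering (given
by the two-point correlation function at separation $|\mathbf{r}_i - \mathbf{r}_j|$, $\xi_{ij}$), and a shot
noise term $N_{ij} = \bar n_i^{-1}\,\delta_D(\mathbf{r}_i - \mathbf{r}_j)$. Applications of the ML estimator to
measurements of the 2-D galaxy power spectrum were recently carried out for the APM [203]
and EDSGC [331] surveys (see Sect. 8.2.2).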
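
The iteration described above (start from a prior, estimate, feed the estimate back) can be sketched in matrix form. The example assumes that the covariance is linear in the band powers, $C = N + \sum_\alpha f_\alpha\,\partial C/\partial f_\alpha$, with fixed templates $\partial C/\partial f_\alpha$ and a known noise matrix; the pixelization on a ring, the cosine templates and the function names are illustrative assumptions rather than the setup of the cited analyses.

```python
import numpy as np

def fisher_matrix(C_inv, dC):
    """Gaussian Fisher matrix, F_ab = 1/2 Tr[C^-1 dC_a C^-1 dC_b], cf. Eq. (506)."""
    W = [C_inv @ d for d in dC]
    return np.array([[0.5 * np.trace(Wa @ Wb) for Wb in W] for Wa in W])

def quadratic_step(data_outer, f_prior, dC, noise):
    """One evaluation of Eq. (505) for a given prior; data_outer is x x^T."""
    C = noise + sum(f * d for f, d in zip(f_prior, dC))
    C_inv = np.linalg.inv(C)
    F = fisher_matrix(C_inv, dC)
    q = np.array([0.5 * np.trace(C_inv @ d @ C_inv @ (data_outer - noise))
                  for d in dC])
    return np.linalg.solve(F, q), F

def ml_band_powers(data_outer, f_start, dC, noise, max_iter=100, tol=1e-10):
    """Iterate the quadratic step until the estimate equals the prior."""
    f = np.asarray(f_start, dtype=float)
    for _ in range(max_iter):
        f_new, F = quadratic_step(data_outer, f, dC, noise)
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return f, F

# Toy setup (assumed): 8 pixels on a ring, two cosine "band" templates.
n_pix = 8
pos = 2.0 * np.pi * np.arange(n_pix) / n_pix
dC = [np.cos(k * (pos[:, None] - pos[None, :])) for k in (1, 2)]
noise = 0.5 * np.eye(n_pix)
f_true = np.array([2.0, 0.7])
C_true = noise + sum(f * d for f, d in zip(f_true, dC))

# Feeding the ensemble average <x x^T> = C_true checks that the step is unbiased.
f_one, _ = quadratic_step(C_true, [1.0, 1.0], dC, noise)
f_ml, F = ml_band_powers(C_true, [1.0, 1.0], dC, noise)
print(f_one, f_ml)
```

For a single noisy realization one would pass `np.outer(x, x)` instead of `C_true`; the iteration can then wander to parameter values where the model covariance is no longer positive definite, which this sketch does not guard against.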
6.11.2 Quadratic Estimators

In reality it is in general difficult to express explicitly the likelihood function in
terms of the parameters. In addition, even if we restrict ourselves to the case where the
parameters are given by the power spectrum as a function of scale, as discussed
in the previous section, one must iterate numerically to obtain the ML
estimates, and their probability distribution must also be computed numerically
in order to provide error bars (see footnote 84). As a result, a useful approach is to seek an
optimal estimator, unbiased and having minimum variance, by restricting the
optimization to a subspace of estimators, as discussed in Sect. 6.9. Of course,
this method is not restricted to the assumption of Gaussianity, provided that
the variance is calculated including non-Gaussian contributions. It turns out
that there is an elegant solution to the problem [293,296], which in its exact form
is unfortunately difficult to implement in practice, but which illustrates the
connection to the ML estimate (505) in the Gaussian limit and also provides
a generalization of the standard optimal weighting results, Eqs. (474,476), to
include non-Gaussian (and non-diagonal) elements of the covariance matrix.
Since the power spectrum is by definition a quadratic quantity in the
overdensities, it is natural to restrict the search to quadratic functions of the data.
In this framework, the unbiased estimator (see footnote 85) of the power spectrum having
minimum variance reads [293,296]

$$ \hat f_\alpha = F^{-1}_{\alpha\beta}\, \frac{\partial C_{ij}}{\partial f_\beta}\, [C^{-1}]_{ijkl}\, (\delta_k\delta_l - \hat N_{kl}), \tag{507} $$

where the variance is given by Eq. (504) and the Fisher matrix by Eq. (506),
replacing $\frac{1}{2}[C^{-1}]_{ik}[C^{-1}]_{jl}$ with $[C^{-1}]_{ijkl}$, where

$$ C_{ijkl} \equiv \langle(\delta_i\delta_j - \hat N_{ij} - \xi_{ij})\,(\delta_k\delta_l - \hat N_{kl} - \xi_{kl})\rangle \tag{508} $$

is the (shot noise subtracted) power spectrum covariance matrix. Here $\hat N_{ij}$
denotes the 'actual' shot noise, meaning that the self-pair contributions to
$\xi_{ij}$ are not included, see [296] for details. In the Gaussian limit,
$[C^{-1}]_{ijkl} \to \frac{1}{2}[C^{-1}]_{ik}[C^{-1}]_{jl}$ (symmetrized over indices $k$ and $l$) and the minimum variance
estimator, Eq. (507), reduces to the ML estimator, Eq. (505), assuming iteration to
convergence is carried out as discussed above. If the iteration is not done, the
estimator remains quadratic in the data, and it corresponds to using Eq. (505)
with a fixed prior; this should already be a good approximation to the full ML
estimator, otherwise it would indicate that the result depends sensitively on
the prior and thus that there is no significant information coming from the data.
The use of such quadratic estimators in the Gaussian limit to measure the
galaxy power spectrum is discussed in detail in [648], see also [647,646,80].

84 However, see [81] for an analytic approximation in the case of the 2-D power 
spectrum using an offset lognormal. 

85 This is assuming that the mean density is perfectly known. 
Extension to minimum variance cubic estimators for the angular bispectrum
in the Gaussian limit is considered in [307,245].

Note that the full minimum variance estimator involves inverting a rank 4
matrix, a very demanding computational task, which however simplifies
significantly in the Gaussian limit where $C$ factorizes. Another case in which the
result becomes simpler is the so-called FKP limit [212], where the selection
function $\bar n_g(\mathbf{r})$ can be taken as locally constant compared to the scale
under consideration. This becomes a good approximation at scales much smaller
than the characteristic size of the survey, which for present surveys is where
non-Gaussian contributions become important, so it is a useful approximation.
In this case the minimum variance pair weighting for a pair $ij$ is a function
only of the separation of the pair, not of its position or orientation, since
$\bar n_i$ and $\bar n_j$ are assumed to be locally constant. As a result, the power spectrum
covariance matrix can be written in terms of a two by two reduced covariance
matrix which, although not diagonal due to non-Gaussian contributions,
becomes so in the Gaussian limit, leading to the standard result Eq. (476). We
refer the reader to [296] for more details.

6.11.3 Uncorrelated Error Bars 

Clearly, minimum variance estimates can be deceptive if correlations between
them are substantial. Ideally one would like to obtain not only an optimal
estimator (with minimum error bars), but also estimates which are uncorrelated
(with diagonal covariance matrix), as in the case of the power spectrum of a
Gaussian field in the infinite volume limit. Once the optimal (or best possible)
estimator $\hat{\mathbf{f}}$ is found, it is possible to work in a representation where the cosmic
covariance matrix $\mathbf{C}$ becomes diagonal,

$$ \mathbf{C}\cdot\Psi_j = \lambda_j\,\Psi_j, \tag{509} $$

where the eigenvectors $\Psi_j$ form an orthonormal basis. A new set of estimators
can be defined,

$$ \hat{\mathbf{g}} = \Psi^t\cdot\hat{\mathbf{f}}, \tag{510} $$

which are statistically orthogonal:

$$ \langle\delta\hat g_i\,\delta\hat g_j\rangle = \Psi_i^t\cdot\mathbf{C}\cdot\Psi_j = \lambda_i\,\delta_{ij}. \tag{511} $$

These new estimators can in principle be completely different from the original
set, but if by chance the diagonal terms of $\mathbf{C}$ are dominant, then we have $\hat{\mathbf{g}} \simeq \hat{\mathbf{f}}$.
In fact, if one takes the example of the two-point correlation function (or higher
order) in the case where the galaxy number density is known, using the new estimator
$\hat{\mathbf{g}}$ is equivalent to changing the binning function $\Theta$ defined previously to a
more complicated form. Among those estimators which are uncorrelated, it is
however important to find the set $\hat{\mathbf{g}}$ such that the equivalent binning function
is positive and compact in Fourier space and $\langle\hat{\mathbf{g}}\rangle \simeq \mathbf{f}$, so as to keep the
interpretation of the power in this new representation as giving the power
centered about some well-defined scale [294,296].
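
The change of representation in Eqs. (509) to (511) amounts to an eigen-decomposition of the covariance matrix. The sketch below assumes a toy covariance of five band estimates with correlated neighbouring bins and made-up values for the estimates; it only illustrates the linear algebra and does not address the additional requirement, stated above, that the equivalent binning function be positive and compact.

```python
import numpy as np

def decorrelate(C):
    """Eigenvalues and orthonormal eigenvectors (columns) of a symmetric covariance."""
    lam, psi = np.linalg.eigh(C)
    return lam, psi

# Toy covariance (assumed): equal variances, neighbouring bins 50% correlated.
n = 5
C = 0.04 * (np.eye(n) + 0.5 * np.eye(n, k=1) + 0.5 * np.eye(n, k=-1))
f_hat = np.array([1.0, 0.8, 0.6, 0.45, 0.3])     # made-up correlated estimates

lam, psi = decorrelate(C)
g_hat = psi.T @ f_hat                            # Eq. (510)
cov_g = psi.T @ C @ psi                          # Eq. (511): diagonal, entries lam
print(np.round(cov_g, 6))
```

Since the transformation is orthogonal it is invertible (`psi @ g_hat` recovers `f_hat`), so no information is lost; what changes is the interpretation of each new estimate as a mixture of the original bins.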

The above line of thought can in fact be pushed even further by applying the
so-called "pre-whitening" technique to $\hat{\mathbf{f}}$: if $\hat{\mathbf{f}}$ is decomposed in terms of signal
plus noise, pre-whitening basically consists in multiplying $\hat{\mathbf{f}}$ by a function $h$
such that the noise becomes white or constant. If the noise is uncorrelated, this
method allows one to diagonalize simultaneously the covariance matrix of the
signal and the noise. When non-Gaussian contributions to the power spectrum
covariance matrix are included, however, such a diagonalization is no longer
possible. Nevertheless, in the FKP approximation described in the previous
section, it was shown that an approximate diagonalization (where two of the
contributions, coming from the two- and four-point functions, are exactly diagonal,
whereas the third, coming from the three-point function, is not) works extremely
well, at least when non-Gaussianity is modeled by the hierarchical ansatz [296].
The quantity whose covariance matrix has these properties corresponds to
the so-called prewhitened power spectrum, which is most easily written in real
space [296]:

$$ \tilde\xi(r) \equiv \frac{2\,\xi(r)}{1 + [1 + \xi(r)]^{1/2}}. \tag{512} $$

Note that in the linear regime, $\tilde\xi(k)$ reduces to the linear power spectrum;
however, unlike the non-linear power spectrum, $\tilde\xi(k)$ has an almost diagonal cosmic
covariance matrix even for nonlinear modes. More details on the theory and
applications to observations can be found in e.g. [296,297] and [298,487,299],
respectively.
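
The real-space prewhitening map is simple enough to explore directly. The sketch below applies it to a toy power-law correlation function, $\xi(r) = (r/5)^{-1.8}$ with $r$ in $h^{-1}$ Mpc, which is an assumed illustrative form and not a fit quoted in the text. Two limits are worth noting: for $\xi \ll 1$ the map leaves $\xi$ unchanged, while for $\xi \gg 1$ it compresses it to about $2\sqrt{\xi}$; the map is also equivalent to $(1 + \tilde\xi/2)^2 = 1 + \xi$.

```python
import numpy as np

def prewhiten(xi):
    """Prewhitened correlation function: 2 xi / (1 + sqrt(1 + xi))."""
    xi = np.asarray(xi, dtype=float)
    return 2.0 * xi / (1.0 + np.sqrt(1.0 + xi))

# Toy power-law correlation function (assumed form and parameters).
r = np.logspace(-1, 2, 7)                 # separations in h^-1 Mpc
xi = (r / 5.0) ** -1.8
xi_tilde = prewhiten(xi)

for ri, x, xt in zip(r, xi, xi_tilde):
    print(f"r = {ri:8.2f}  xi = {x:10.4f}  prewhitened = {xt:8.4f}")
```

Obtaining the prewhitened power spectrum would additionally require a Fourier transform of `xi_tilde`, which is not attempted here.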

6.11.4 Data Compression and the Karhunen-Loeve Transform 
A problem with modern surveys such as the 2dFGRS and SDSS
is that the data set $\mathbf{x}$ becomes too large for a "brute force" application of
estimation techniques. Before the statistical treatment of the data discussed in
the previous sections, it might be necessary to find a way to reduce their size
while keeping as much information as possible. The (discrete) Karhunen-Loeve
transform (KL) provides a fairly simple method to do that (see e.g. [680,646]
and references therein for more technical details and e.g. [487,443] for
practical applications to observations). Basically, the idea is to work in the space of
eigenvectors $\Psi_j$ of the cross-correlation matrix $\mathbf{M} \equiv \langle\delta\mathbf{x}\cdot\delta\mathbf{x}^t\rangle$, i.e. to
diagonalize the cosmic covariance matrix of the data,

$$ \mathbf{M}\cdot\Psi_j = \lambda_j\,\Psi_j, \tag{513} $$

where the matrix $\Psi$ is unitary, $\Psi^{-1} = \Psi^t$. A new set of data, $\mathbf{y}$, can be defined,

$$ \mathbf{y} = \Psi^t\cdot\mathbf{x}, \tag{514} $$

which is statistically orthogonal:

$$ \langle\delta y_i\,\delta y_j\rangle = \Psi_i^t\cdot\mathbf{M}\cdot\Psi_j = \lambda_i\,\delta_{ij}. \tag{515} $$


The idea is to sort the new data from highest to lowest value of $\lambda_i$. Data
compression then consists in ignoring data $y_i$ with $\lambda_i$ lower than some threshold.
An interesting particular case of the KL transform is when the data can be
decomposed into signal plus noise, uncorrelated with each other [79]:

$$ \mathbf{x} = \mathbf{s} + \mathbf{n}. \tag{516} $$

The signal and the noise covariance matrices read

$$ \mathbf{S} \equiv \langle\delta\mathbf{s}\cdot\delta\mathbf{s}^t\rangle, \qquad \mathbf{N} \equiv \langle\delta\mathbf{n}\cdot\delta\mathbf{n}^t\rangle. \tag{517} $$

Then, instead of diagonalizing the cosmic covariance matrix of the data, one
solves the generalized eigenvalue problem

$$ \mathbf{S}\cdot\Psi_j = \lambda_j\,\mathbf{N}\cdot\Psi_j, \qquad \Psi_j^t\cdot\mathbf{N}\cdot\Psi_j = 1. \tag{518} $$

The new data vector given by Eq. (514) is statistically orthogonal and
verifies (see footnote 86)

$$ \langle\delta y_i\,\delta y_j\rangle = (1 + \lambda_i)\,\delta_{ij}. \tag{519} $$

One can easily be convinced that this new transform is equivalent to a KL
transform applied to the "prewhitened" data, $(\mathbf{N}^t)^{-1/2}\cdot\mathbf{x}$, where

$$ \mathbf{N} = (\mathbf{N}^t)^{1/2}\cdot\mathbf{N}^{1/2}. \tag{520} $$

The advantage of this rewriting is that the quantity $\lambda_i$ can now be considered
as a signal-to-noise ratio, $1 + \lambda_i = 1 + S/N$. Data compression on the prewhitened
data now makes full physical sense, even if the noise is inhomogeneous or
correlated.
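
The signal-to-noise eigenmodes of Eq. (518) can be computed by prewhitening with a Cholesky factor of the noise matrix, which is one possible choice of matrix square root. The sketch below assumes a toy signal covariance with exponentially decaying correlations and an inhomogeneous diagonal noise; both matrices, the threshold and the function name are illustrative assumptions.

```python
import numpy as np

def signal_to_noise_modes(S, N):
    """Solve S psi = lam N psi with psi^T N psi = 1, sorted by decreasing lam."""
    L = np.linalg.cholesky(N)             # N = L L^T
    L_inv = np.linalg.inv(L)
    S_white = L_inv @ S @ L_inv.T         # signal covariance of prewhitened data
    lam, U = np.linalg.eigh(S_white)
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    psi = L_inv.T @ U                     # columns are the modes psi_j
    return lam, psi

# Toy covariances (assumed): correlated signal, inhomogeneous white noise.
n = 6
idx = np.arange(n)
S = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 2.0)
N = np.diag(np.linspace(0.5, 2.0, n))

lam, psi = signal_to_noise_modes(S, N)
cov_y = psi.T @ (S + N) @ psi             # Eq. (519): diag(1 + lam)

# Compression: keep only modes whose signal-to-noise exceeds a chosen threshold.
keep = lam > 0.5
compress = psi[:, keep].T                 # apply to data as y = compress @ x
print(np.round(lam, 3), compress.shape)
```

The threshold on `lam` is a free choice; as the text recommends, results obtained from the compressed data should be checked against the number of modes kept.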

The KL compression is generally used as a first step to reduce the size of the 
data set keeping as much information as possible, which can then be processed 
by the methods of ML estimation or quadratic estimation which otherwise 
would not be computationally feasible. The final results should be checked 
against the number of KL modes kept in the analysis, to show that significant 
information has not been discarded. Note that in addition, since the methods 
generally used after KL compression assume Gaussianity, one must check as 
well that modes which probe the weakly non-linear regime are not included in 
the analysis to avoid having undesired biases in the final results. 

86 In the approximation that the distribution of x is Gaussian, this also implies 
statistical independence. 
6.12 Measurements in N-Body Simulations

Measurements of statistics in N-body simulations are of course subject to
the cosmic error problem, but can be contaminated by other spurious effects
related to limitations of the numerical approach used to solve the equations
of motion. Transients, related to the way initial conditions are usually set up,
were already discussed in Sect. 5.7. Here, we first consider the cosmic error
and the cosmic bias problems, which in practice are slightly different from the
case of galaxy catalogs. Second, we briefly mention problems due to N-body
relaxation and short-range softening of the gravitational force.

6.12.1 Cosmic Error and Cosmic Bias in Simulations 

Here we restrict ourselves to the case of N-body simulations of self-gravitating
collisionless dark matter. Most simulations are done in a cubic box with periodic
boundaries. The first important consequence is that the average number
density of particles, $\bar n_g$, is perfectly determined.

The second consequence, as mentioned earlier, is that edge effects are
nonexistent. The only sources of errors are finite volume and shot noise. With the
new generation of simulations, discreteness effects are in general quite small
except at small scales or if a sparse synthetic catalog of "galaxies" is extracted
from the dark matter distribution. Finite volume effects in simulations have
been extensively studied in [147,149,150]. For these effects to be insignificant
in measured moments or correlation functions of the density distribution, the
simulation box size $L$ has to be large compared to the typical size of a large
cluster, the correlation length $R_0$. Typically it is required that $R_0 \lesssim L/20$.
Even if this condition is fulfilled, the sampling scales (or separations) $R$ must
be small fractions of the box size in order to achieve fair measurements,
typically $R \lesssim L/10$. Indeed, because of finite volume effects, moments of the
density distribution, cumulants and N-point correlation functions tend to be
systematically underestimated, increasingly so with scale. This is a consequence
of cosmic bias and effective bias due to the skewness of the cosmic distribution
function, as discussed in Sect. 6.10.

The estimation of cosmic bias was addressed quantitatively at large scales
in [580] using PT, where it was found that although moments can be affected
by as much as 80% at smoothing scales one tenth of the size of the box (for
$n = -2$), the skewness $S_3$ was affected by at most 15% at the same scale.
Finite volume effects for velocity statistics are much more severe, as they are
typically dominated by long wavelength fluctuations, e.g. see [342].

The most obvious consequence of finite volume effects is the fact that the
high-density tail of the PDF develops a cutoff due to the finite number of particles.
A method was proposed in [145,147,149] and exploited in other works [150,472]
to correct the PDF for finite volume effects, by smoothing and extending to
infinity its large-$\delta$ tail. Another way to bypass finite volume effects consists in
doing several simulations and taking the average value (see, e.g. [356,251,28])
of the moments or cumulants, with the appropriate procedure for cumulants
to avoid possible biases. This is, however, not necessarily sufficient by itself,
because in each realization, large scale fluctuations are still missing due to
the periodic boundaries (e.g. [580]). In other words, doing a number of
random realizations of given size $L$ with periodic boundaries is not equivalent to
extracting subsamples of size $L$ from a very large volume. With many
realizations one can reduce arbitrarily the effect of the skewness of the distribution,
but not the influence of large-scale waves not present due to the finite volume
of the simulations.

6.12.2 N-Body Relaxation and Force Softening 

Due to the discrete nature of numerical simulations, there are some
dynamical effects due to interactions between small numbers of particles. To reduce
these relaxation effects it is necessary to bound forces at small interparticle
separation, thus a softening $\epsilon$ is introduced as discussed in Sect. 2.9.
However, this softening does not guarantee the fluid limit. The latter is achieved
locally only when the number of particles in a softening volume $\epsilon^3$ is large.
Typically, the softening parameter is of order the mean interparticle distance
$\lambda$ in low-resolution simulations, or of order $\lambda/20$ in high-resolution
simulations (Sect. 2.9). At early stages of simulations, where the particles are almost
homogeneously distributed, relaxation effects are thus expected to be
significant. Later, when the system has reached a sufficient degree of nonlinearity, these
effects occur only in underdense regions (see footnote 87). It is therefore important to wait
long enough so that the simulation has reached a stage where typical nonlinear
structures contain many particles.

Statistically, this is equivalent to saying that the correlation length should be
much larger than the mean interparticle distance, $R_0 \gg \lambda$ [150]. This criterion
is valid for most statistics but there are exceptions. For example, it was shown
that the void probability distribution function can be contaminated by the
initial pattern of particles (such as a grid) even at late stages [149]. Indeed,
underdense regions tend to expand and to keep the main features of this initial
pattern. Another consequence is that the local Poisson approximation is not
valid if this initial pattern presents significant correlations or anticorrelations
(such as a grid or a "glass" [28,688]).

Finally, short-range softening of the forces itself can contaminate the
measurement of statistics at small scales. With a careful choice of the timestep
(see, e.g. [199]) the effects of the softening parameter are negligible for scales
sufficiently large compared to $\epsilon$, a practical criterion being that the considered
scale $R$ verifies $R \geq \alpha\epsilon$ with $\alpha$ of order a few [150].


87 In fact, in these regions, small but rare groups of particles experiencing strong 
collisions can be found even at late stages of the simulations. 
7 Applications to Observations


7.1 The Problem of Galaxy Biasing 

Application to galaxy surveys of the results that have been obtained for the
clustering of dark matter is not trivial, because in principle there is no
guarantee that galaxies are faithful tracers of the dark matter field. In other words,
the galaxy distribution may be a biased realization of the underlying dark
matter density field.

A simplified view of biasing often encountered in the literature is that the two
fields, the galaxy and matter density fields, are simply proportional to each other,

$$ \delta_g(\mathbf{x}) = b\,\delta(\mathbf{x}). \tag{521} $$

It implies in particular that the power spectra obey $P_g(k) = b^2 P(k)$. As long
as one considers two-point statistics this might be a reasonable prescription;
however, when one wants to address non-Gaussian properties, this is no longer
sufficient: the connection between dark matter fluctuations and galaxies, or
clusters of galaxies, should be given in more detail.

In principle, this relation should be obtained as a prediction of a given
cosmological model. However, although significant progress has been made recently
in studying galaxy formation from "first principles" via hydrodynamic
numerical simulations [122,369,72,498], these still suffer from limited dynamical range
and rely on simplified descriptions of star formation and supernova feedback,
which are poorly understood. This fundamental problem implies that when
dealing with galaxies, one must usually include additional (non-cosmological)
parameters to describe the relation between galaxies and dark matter. These
parameters, known generally as bias parameters, must be determined from the
data themselves. In fact, the situation turns out to be more complicated than
that: since there is no generally accepted framework for galaxy biasing yet,
one needs to test the parameterization itself against the data in addition to
obtaining the best fit parameter set.

The complexity of galaxy biasing is reflected in the literature, where many
different approaches have emerged in the last decade or so. In addition to
the hydrodynamic simulations, two other major lines of investigation can
be identified in studies of galaxy biasing. The simplest one involves a
phenomenological mapping from the dark matter density field to galaxies, which
is reviewed in the next section. Another approach, which has become popular
in recent years, is to split the problem of galaxy biasing into two different
steps [686]. The first step is the formation and clustering of dark matter halos, which
can be modeled neglecting non-gravitational effects; this is the subject of
sections 7.1.2 and 7.1.3. This step is thought to be sufficient to describe the spatial
distribution of galaxy clusters. The second step, discussed in section 7.1.4, is
the distribution of galaxies within halos, which is described by a number of
simplifying assumptions about the complex non-gravitational physics. It is
generally believed that such processes are likely to be very important in
determining the properties of galaxies while having little effect on the formation
and clustering of dark matter halos.

Note that observational constraints on biasing (from higher-order correlations)
are discussed in the next chapter (see Sections 8.2.6 and 8.3.5).


7.1.1 Some General Results 

The first theoretical approach to galaxy biasing was put forward by Kaiser [360],
who showed that if rich galaxy clusters are rare density peaks in a Gaussian
random field, they will be more strongly clustered than the mass, as
observed [503,15]. These calculations were further extended in [491,21]. In
particular, it was found that rare peaks were correlated in such a way that

$$\delta_{\rm peak}({\bf x}) = b_{\rm peak}\,\delta({\bf x}) \tag{522}$$

where $\delta_{\rm peak}$ is the local density contrast in the number density of peaks, with
a bias parameter

$$b_{\rm peak} = \frac{\nu}{\sigma} \tag{523}$$

where $\sigma$ is the rms fluctuation (the square root of the variance) at the peak
scale, and $\nu$ is the intrinsic density contrast of the selected peaks in units of
$\sigma$. These results led to studies of biasing in CDM numerical simulations
[173,685], which indeed showed that massive dark matter halos are more strongly
clustered than the mass. However, numerical simulations also showed later that
dark matter halos are not always well identified with peaks in the linear
density field [368].

An alternative description of biasing, which does not rely on the initial density
field, is the local Eulerian bias model. In this case, the assumption is that
at scales $R$ large enough compared to those where non-gravitational physics
operates, the smoothed (over scale $R$) galaxy density at a given point is a
function of the underlying smoothed density field at the same point,

$$\delta_g({\bf x}) = F[\delta({\bf x})], \qquad \delta({\bf x}) = \int_{|{\bf x}'|<R} d^3x'\,\delta({\bf x}-{\bf x}')\,W({\bf x}') \tag{524}$$

where $W$ denotes some smoothing filter. For large $R$, where $\delta \ll 1$, it is
possible to perturbatively expand the function $F$ in a Taylor series and compute
the galaxy correlation hierarchy [235]. Indeed, one can write

$$\delta_g = \sum_{k=0}^{\infty} \frac{b_k}{k!}\,\delta^k \tag{525}$$

where the linear term $b_1$ corresponds to the standard linear bias factor. In
this large-scale limit, such a local transformation preserves the hierarchical
properties of the matter distribution, although the values of the hierarchical
amplitudes may change arbitrarily. In particular [235],


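$$
\begin{aligned}
\sigma_g^2 &= b_1^2\,\sigma^2, \\
S_{g,3} &= b_1^{-1}\left(S_3 + 3c_2\right), \\
S_{g,4} &= b_1^{-2}\left(S_4 + 12c_2 S_3 + 4c_3 + 12c_2^2\right), \\
S_{g,5} &= b_1^{-3}\left[S_5 + 20c_2 S_4 + 15c_2 S_3^2 + (30c_3 + 120c_2^2)\,S_3 + 5c_4 + 60c_3 c_2 + 60c_2^3\right],
\end{aligned}
\tag{526}
$$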


where $c_k \equiv b_k/b_1$. As pointed out in [235], this framework encompasses the
model of bias as a sharp threshold clipping [360,523,21,615], where $\delta_g = 1$ for
$\delta > \nu\sigma$ and $\delta_g = 0$ otherwise. Although it does not have a series representation
around $\delta = 0$, such a clipping applied to a Gaussian background produces
a hierarchical result with $S_{g,p} = p^{p-2}$ in the limit $\nu \gg 1$, $\sigma \ll 1$. This is
the same result as we obtain from Eq. (526) for an exponential biasing of
a Gaussian matter distribution, $\delta_g = \exp(\alpha\delta/\sigma)$, which is equivalent to the
sharp threshold when the threshold is large and fluctuations are weak [21,615].
The exponential bias function has an expansion $F = \sum_{k=0}^{\infty} (\alpha\delta/\sigma)^k/k!$ and thus
$b_k = b_1^k$, independently of $\alpha$ and $\sigma$. With $S_p = 0$, the terms induced in Eq. (526)
by $b_k$ alone also give $S_{g,p} = p^{p-2}$.
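
The statement about exponential biasing can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are illustrative, and the value $b_1 = 2.5$ is an arbitrary choice. It evaluates Eq. (526) for a Gaussian matter field ($S_p = 0$) with $b_k = b_1^k$.

```python
def biased_hierarchical_amplitudes(S, b):
    """Galaxy amplitudes S_{g,p}, p = 3, 4, 5, from Eq. (526).

    S: dict {3: S_3, 4: S_4, 5: S_5} for the matter field.
    b: dict {1: b_1, 2: b_2, 3: b_3, 4: b_4} of local bias parameters.
    """
    b1 = b[1]
    c2, c3, c4 = b[2] / b1, b[3] / b1, b[4] / b1
    S3, S4, S5 = S[3], S[4], S[5]
    Sg3 = (S3 + 3 * c2) / b1
    Sg4 = (S4 + 12 * c2 * S3 + 4 * c3 + 12 * c2**2) / b1**2
    Sg5 = (S5 + 20 * c2 * S4 + 15 * c2 * S3**2
           + (30 * c3 + 120 * c2**2) * S3
           + 5 * c4 + 60 * c3 * c2 + 60 * c2**3) / b1**3
    return {3: Sg3, 4: Sg4, 5: Sg5}


def exponential_bias(b1, kmax=4):
    """Bias parameters b_k = b_1**k of the exponential bias function."""
    return {k: b1**k for k in range(1, kmax + 1)}


gaussian_matter = {3: 0.0, 4: 0.0, 5: 0.0}   # S_p = 0
result = biased_hierarchical_amplitudes(gaussian_matter, exponential_bias(2.5))
```

For any nonzero $b_1$ the three entries of `result` reduce to $3$, $4^2$ and $5^3$, i.e. $p^{p-2}$, because every surviving term in Eq. (526) scales as $b_1^{p-2}$.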

As a result of Eq. (526), it is clear that for high-order correlations, $p > 2$,
a linear bias assumption cannot be a consistent approximation even at very
large scales, since non-linear biasing can generate higher-order correlations. To
draw any conclusions from the galaxy distribution about matter correlations
of order $p$, properties of biasing must be included to order $p - 1$.

Let us make at this stage a general remark. From Eq. (526) it follows that
in the simplest case, when the bias is linear, a value $b_1 > 1$ reduces the
$S_p$ parameters, and it may suggest that this changes how the distribution
deviates from a Gaussian (e.g. the galaxy field would be “more Gaussian”
than the underlying density field, given that $S_3$ is smaller). However, this is
obviously an incorrect conclusion: a linear scaling of the density field cannot
alter the degree of non-Gaussianity. The reason is that the actual measure of
non-Gaussianity is encoded not by the hierarchical amplitudes $S_p$ but rather
by the dimensionless skewness $B_3 \equiv S_3\,\sigma$, kurtosis $B_4 \equiv S_4\,\sigma^2$, and so on,
which remain invariant under linear biasing. These dimensionless quantities
are indeed what characterize the probability distribution function, as
appears clearly in an Edgeworth expansion, Eq. (144).
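
The invariance of $B_3$ and $B_4$ follows directly from the linear-bias limit of Eq. (526). A short sketch of that arithmetic is given below; the input values of $\sigma$, $S_3$, $S_4$ and $b_1$ are arbitrary illustrative numbers, not results from the text.

```python
import math


def apply_linear_bias(sigma, S3, S4, b1):
    """Linear-bias limit of Eq. (526): c_k = 0 for k >= 2."""
    return b1 * sigma, S3 / b1, S4 / b1**2


def dimensionless_cumulants(sigma, S3, S4):
    """Skewness B_3 = S_3 sigma and kurtosis B_4 = S_4 sigma^2."""
    return S3 * sigma, S4 * sigma**2


# Illustrative matter-field values (assumptions).
sigma, S3, S4, b1 = 0.2, 4.0, 30.0, 2.0

B3_matter, B4_matter = dimensionless_cumulants(sigma, S3, S4)
B3_galaxy, B4_galaxy = dimensionless_cumulants(*apply_linear_bias(sigma, S3, S4, b1))
```

Although $S_{g,3}$ is half of $S_3$ here, `B3_galaxy` equals `B3_matter`, and likewise for the kurtosis.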

Since Fourier transforms are effectively a smoothing operation, similar results
to those above hold for Fourier-space statistics at low wavenumbers. In this
regime, the galaxy density power spectrum $P_g(k)$ is given by

$$P_g(k) = b_1^2\,P(k), \tag{527}$$

and the galaxy (reduced) bispectrum obeys [recall Eq. (154)]

$$Q_g(k_1,k_2,k_3) = \frac{1}{b_1}\,Q(k_1,k_2,k_3) + \frac{b_2}{b_1^2}. \tag{528}$$

As discussed in Section 4.1.3, $Q$, given by Eq. (155), is very insensitive to
cosmological parameters and depends mostly on triangle configuration and the
power spectrum spectral index. Since the latter is not affected by bias in the
large-scale limit, Eq. (527), it can be measured from the galaxy power spectrum
and used to predict $Q(k_1,k_2,k_3)$ as a function of triangle configuration.
As first proposed in [224,236], a measurement of $Q_g$ as a function of triangle
shape can be used to determine $1/b_1$ and $b_2/b_1^2$. So far, this technique has
only been applied to IRAS galaxies [567,211], as will be reviewed in the next
chapter (see Sect. 8.3.3)$^{88}$.
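
Because Eq. (528) is linear in $Q$, the two combinations $1/b_1$ and $b_2/b_1^2$ can be read off from a straight-line fit of $Q_g$ against the predicted $Q$ over a set of triangle shapes. The sketch below illustrates the idea with an ordinary least-squares fit on synthetic, noise-free data; the $Q$ values and the input bias parameters are made-up numbers, and a real analysis would need the covariance of the bispectrum estimates.

```python
def fit_bias_from_bispectrum(Q_matter, Q_galaxy):
    """Least-squares fit of Q_g = A*Q + B, with A = 1/b_1 and B = b_2/b_1^2.

    Returns (b_1, b_2). Requires Q_matter to vary between triangles;
    a single triangle shape cannot separate b_1 from b_2.
    """
    n = len(Q_matter)
    mean_q = sum(Q_matter) / n
    mean_qg = sum(Q_galaxy) / n
    sxx = sum((q - mean_q) ** 2 for q in Q_matter)
    sxy = sum((q - mean_q) * (qg - mean_qg) for q, qg in zip(Q_matter, Q_galaxy))
    slope = sxy / sxx
    intercept = mean_qg - slope * mean_q
    b1 = 1.0 / slope
    b2 = intercept * b1**2
    return b1, b2


# Hypothetical matter Q for five triangle shapes, and a mock galaxy Q_g.
Q_matter = [0.4, 0.6, 0.9, 1.3, 1.6]
true_b1, true_b2 = 1.2, -0.4
Q_galaxy = [q / true_b1 + true_b2 / true_b1**2 for q in Q_matter]

b1_fit, b2_fit = fit_bias_from_bispectrum(Q_matter, Q_galaxy)
```

The fit recovers the input values because the configuration dependence of $Q$ breaks the degeneracy between the two bias parameters.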

The results above suggest that local biasing does not change the shape of the
correlation function or power spectrum in the large-scale limit, just scaling
them by a constant factor $b_1^2$ independent of scale. This derivation [235]
assumes that the smoothing scale is large enough so that $\delta \ll 1$, but in fact, it
can be shown that this continues to hold in more general situations. For
example, an arbitrary local transformation of a Gaussian field leads to a bias that
cannot be an increasing function of scale and that becomes constant in the
large-scale limit, irrespective of the amplitude of the rms fluctuations [140]$^{89}$.
However, it is easy to show that if the underlying density field is hierarchical
(in the sense that the $C_{pq}$ parameters in Eq. (348) are independent of scale),
a local mapping such as that in Eq. (524) does lead to a bias independent of
scale in the large-scale limit even if $\delta \gg 1$ [41,553].

Recent studies of galaxy biasing [553,180,72,440] have focused on the fact that
Eq. (524) assumes not only that the bias is local but also deterministic; that
is, the galaxy distribution is completely determined by the underlying mass
distribution. In practice, however, it is likely that galaxy formation depends
on other variables besides the density field, and that consequently the relation
between $\delta_g({\bf x})$ and $\delta({\bf x})$ is not deterministic but rather stochastic,

$$\delta_g({\bf x}) = F[\delta({\bf x})] + \epsilon_\delta({\bf x}), \tag{529}$$

where the random field $\epsilon_\delta({\bf x})$ denotes the scatter in the biasing relation at a
given $\delta$ due to the fact that $\delta({\bf x})$ does not completely determine $\delta_g({\bf x})$. Clearly,
for an arbitrary scatter, the effects of $\epsilon_\delta({\bf x})$ on clustering statistics can be
arbitrarily strong. However, under the assumption that the scatter is local,
in the sense that the correlation functions of $\epsilon_\delta({\bf x})$ vanish sufficiently fast at
large separations (i.e. faster than the correlations in the density field), the
deterministic bias results hold for the two-point correlation function in the
large-scale limit [553]. For the power spectrum, on the other hand, in addition
to a constant large-scale bias, stochasticity leads to a constant offset (given
by the rms scatter) similar to Poisson fluctuations due to shot noise [553,180].
Another interesting aspect of stochasticity was studied in [440], in connection
with non-local biasing. A simple result can be obtained as follows. Suppose
that biasing is non-local but linear; then we can write

$$\delta_g({\bf x}) = \int \delta({\bf x}')\,K({\bf x}-{\bf x}')\,d^3x', \tag{530}$$

where the kernel $K$ specifies how the galaxy field at position ${\bf x}$ depends on
the density field at arbitrary locations ${\bf x}'$. This convolution of the density field
leads to stochasticity in real space, i.e. the cross-correlation coefficient

$$r(s) \equiv \frac{\langle \delta_g({\bf x})\,\delta({\bf x}') \rangle}{\sqrt{\xi_g(s)\,\xi(s)}}, \tag{531}$$

where $s \equiv |{\bf x}-{\bf x}'|$, is not necessarily unity. However, due to the convolution
theorem, the cross-correlation coefficient in Fourier space will be exactly unity,
thus

$$\langle \delta_g({\bf k})\,\delta({\bf k}') \rangle = \delta_D({\bf k}+{\bf k}')\,b(k)\,P(k) \tag{532}$$

and

$$\langle \delta_g({\bf k})\,\delta_g({\bf k}') \rangle = \delta_D({\bf k}+{\bf k}')\,b^2(k)\,P(k), \tag{533}$$

where the bias $b(k)$ is the Fourier transform of the kernel $K$. The study in [440]
showed on the other hand that the real-space stochasticity (in the sense that
$r < 1$) at large scales was weak for some class of models. At small scales,
however, significant deviations from $r = 1$ cannot be excluded, for example
due to nonlinear couplings in Eq. (530). However, without specifying more
about the details of the biasing scheme, it is very difficult to go much beyond
these results.

88 Similar relations to Eq. (526) and Eq. (528) can be obtained for cumulant
correlators, see [626].

89 But this is an unrealistic situation, since Gaussianity breaks down when the rms
fluctuations are larger than unity.

Most of the general results discussed so far have been observed in hydrodynamical
simulations of galaxy formation. For example, in [72] it was found
that at large scales ($R > 15$ Mpc/h) the bias parameter tends to be constant
and the cross-correlation coefficient $r$ reaches unity for the oldest galaxies. The
authors stress that the bias shows a substantial scale dependence at smaller
scales, which they attribute to the dependence of galaxy formation on the
temperature of the gas (which governs its ability to cool). In addition, they
observe a substantial amount of stochasticity for young galaxies ($r \approx 0.5$), even
at large scales. However, these results are in disagreement with observations
of the LCRS survey, where it was found that after correcting for errors in the
selection function the cross-correlation between early- and late-type galaxies is
$r \approx 0.95$ [71].

Another assumption that enters into the local Eulerian biasing model discussed
above is that the galaxy field depends on the underlying density field at the
same time. In practice, it is expected that to some extent the histories of merging
and tidal effects affect the final light distribution. This can lead to non-trivial
time evolution of biasing. For instance, as shown in [241], if galaxy formation
was very active in the past but after some time it becomes subdominant, then
in the absence of merging the galaxy density contrast is expected to follow the
continuity equation,

$$a\,\frac{\partial \delta_g}{\partial t} + {\bf u}\cdot\nabla\delta_g + (1+\delta_g)\,\nabla\cdot{\bf u} = 0 \tag{534}$$

where ${\bf u}$ is the peculiar velocity field of the dark matter field: galaxies are
simple test particles that follow the large-scale flows. Formally this equation
can be rewritten as

$$\frac{d\log(1+\delta_g)}{d\tau} = \frac{d\log(1+\delta)}{d\tau} \tag{535}$$

where $d/d\tau$ is the convective derivative. As a consequence the galaxy density
field is expected to resemble more and more the density field in terms of
correlation properties: both the bias parameters, $b_k$, and the cross-correlation
coefficient, $r$, are expected to approach unity; galaxies “de-bias” when they
just follow the gravitational field [482,241,649]. The higher-order moments
characterized by $S_p$ are also expected to get closer to those for the dark matter
field. These calculations have been illustrated in [241,641].
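
In the linear regime the de-biasing can be made explicit. Linearizing Eq. (535) gives $\delta_g - \delta = {\rm const}$ along the flow, so with $\delta \propto D$ (the linear growth factor) the linear bias relaxes as $b_1 - 1 \propto 1/D$. The sketch below encodes just this linear limit; the function name and the numbers are illustrative assumptions, and no merging or ongoing galaxy formation is included.

```python
def debiased_linear_bias(b_initial, growth_ratio):
    """Linear bias after the growth factor has increased by growth_ratio.

    Uses the linearized continuity-equation result
    b_1(t) = 1 + (b_1* - 1) * D*/D(t), with growth_ratio = D(t)/D*.
    """
    return 1.0 + (b_initial - 1.0) / growth_ratio


# A population born with b_1 = 3, followed while D grows by a factor of 4.
b_now = debiased_linear_bias(3.0, 4.0)          # 1.5
b_antibiased = debiased_linear_bias(0.5, 4.0)   # 0.875, approaching 1 from below
```

Both biased and antibiased populations approach $b_1 = 1$, but only as fast as the growth factor increases.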

One obvious limitation of these “galaxy conserving” schemes is the assumption
that there is no merging, which is expected to play a central role in hierarchical
structure formation. In addition, ongoing galaxy formation leads to galaxies
formed at different redshifts with different “bias at birth”. Indeed, models
based on the continuity equation predict a slower time evolution of bias than
observed in simulations [73,599], i.e. galaxies become unbiased faster than
when these effects are neglected.

An interesting consequence of Eq. (535) has been unveiled in [119], where it is
remarked that the solution is

$$1 + \delta_g({\bf x}, z) = [1 + \delta_g({\bf q})]\,[1 + \delta({\bf x}, z)] \tag{536}$$

where the galaxy field at the Lagrangian position ${\bf q}$ is obtained from the linear
density field at ${\bf q} = {\bf x} - \Psi({\bf q}, z)$ by $\delta_g({\bf q}) = \sum_k (b_k^L/k!)\,\delta_L^k({\bf q})$. That is, in this
model, the bias is assumed to be local in Lagrangian space rather than Eulerian
space. In this particular case, unlike in peaks biasing mentioned above, once
the galaxy field is identified in the initial conditions, its subsequent evolution is
incorporated by Lagrangian perturbation theory to account for displacement
effects due to the gravitational dynamics. In this case, the tree-level bispectrum
amplitude becomes [120]

$$Q_g = \frac{1}{b_1}\,Q + \frac{b_2}{b_1^2} + \frac{4\,b_1^L}{7\,b_1^2}\,\frac{\Delta Q_{12}\,P_g(k_1)P_g(k_2) + {\rm cyc.}}{P_g(k_1)P_g(k_2) + {\rm cyc.}} \tag{537}$$

where $\Delta Q_{12} \equiv 1 - ({\bf k}_1\cdot{\bf k}_2)^2/(k_1 k_2)^2$, and the Eulerian bias parameters $b_1$, $b_2$
follow from the Lagrangian ones, $b_k^L$ (in particular $b_1 = 1 + b_1^L$; see [120]). Note that
the last term in this expression gives a different prediction than Eq. (528) for the
dependence of the galaxy bispectrum on triangle configuration, which can be
tested against observations; application to the PSCz survey bispectrum [211]
suggests that the model in Eq. (528) fits the observations better than Eq. (537).

Finally, we should also mention that a number of phenomenological (more
complicated) mappings from dark matter to galaxies have been studied in
detail in the literature [431,133,474,38]. The results are consistent with
expectations based on the simpler models discussed in this section.


7.1.2 Halo Clustering in the Tree Hierarchical Model 

As mentioned previously, the validity of the prescription (524) is subject to
the assumption that the mass density contrast is small. For biasing at small
scales this cannot be a valid assumption. Insights into the functional relation
between the halo field and the matter field then demand precise modeling
of the matter fields. The tree hierarchical model, Eq. (222), has been shown
to provide solid ground on which to undertake such an investigation [41,57]. In these
papers the connected part of the joint density distribution has been computed for
an arbitrary number of cells, $p_c(\delta_1,\ldots,\delta_p)$, and shown to be of the form

$$p_c(\delta_1({\bf x}_1),\ldots,\delta_p({\bf x}_p)) = \sum_{a=1}^{t_p} Q_{p,a}(\delta_1,\ldots,\delta_p) \sum_{\rm labelings}\ \prod_{\rm edges}^{p-1} \xi({\bf x}_i,{\bf x}_j), \tag{538}$$

with

$$Q_{p,a}(\delta_1,\ldots,\delta_p) = \prod_i p(\delta_i)\,\nu_q(\delta_i) \tag{539}$$

where $\nu_q(\delta)$ is a function of the local density contrast that depends on the
number $q$ of lines it is connected to in the graph. This form implies for instance
that

$$p(\delta_1,\delta_2) = p(\delta_1)\,p(\delta_2)\,[1 + \xi({\bf x}_1,{\bf x}_2)\,\nu_1(\delta_1)\,\nu_1(\delta_2)]. \tag{540}$$

At small scales, when the variance is large, the density contrast of dark matter
halos is much larger than unity, and should be reliably given by a simple
threshold condition, $\delta_i > \delta_{\rm thres}$. Therefore the function $\nu_1$ describes the halo
bias, and higher-order connected (two-point) joint moments follow directly
from this bias function and the two-point correlation function of the mass.

Fig. 43. Example of a computation of the $S_3$, $S_4$ and $S_5$ parameters in the tree
hierarchical model for dark matter halos selected with a varying threshold in $x$,
defined by Eq. (541). Calculations have been made with the vertex generating
function $\zeta(\tau) = (1 - \tau/\kappa)^{-\kappa}$ with $\kappa = 1.3$. For large values of $x$ one explicitly sees the
$S_p \to p^{p-2}$ behavior expected in the high threshold limit.


In this framework a number of important properties and results have been
derived:

(i) the correlation functions of the halo population follow a tree structure
similar to that of the matter field in the large separation limit (e.g.
when the distances between the halos are much larger than their size);

(ii) the values of the vertices depend only on the internal properties of the
halos, namely on the reduced variable,

$$x = \frac{\rho}{\bar\rho\,\sigma^2}; \tag{541}$$

(iii) all vertices are growing functions of $x$ and have a specific large-$x$
asymptotic behavior,

$$\nu_1(x) \equiv b(x) \sim x \tag{542}$$

$$\nu_p(x) \sim b^p(x). \tag{543}$$

The large-$x$ limit that has been found for the high-threshold clipping limit
is once again recovered, since we expect in such a model that $S_p \to p^{p-2}$
when $x \to \infty$. Property (iii), together with (ii), also holds for halos in the
framework of the Press-Schechter approach, as we shall see in the next section
[see discussion below Eq. (556)].

Fig. 44. The functions $\varphi(y)$, $\varphi^{(1)}(y)$ and $\varphi^{(2)}(y)$ are the generating functions of trees
with respectively 0, 1 and 2 external lines. For orders above 2 a possible angular
dependence with the outgoing lines cannot be excluded.

In addition, it is possible to derive the functions $\nu_p(x)$ in terms of the vertex
generating function $\zeta(\tau)$. These results read

$$\nu_p(x) = \int dy\,\varphi^{(p)}(y)\,\exp(xy) \Big/ \int dy\,\varphi(y)\,\exp(xy) \tag{544}$$

where the function $\varphi^{(p)}(y)$ can be expressed in terms of $\zeta$ and its derivatives
(see [57] for details). In the case of the minimal tree model, where all vertices are
pure numbers, we have

$$\varphi(y) = y\,\zeta(\tau) + \tau^2/2, \qquad \tau/\zeta'(\tau) = -y; \tag{545}$$

$$\varphi^{(1)}(y) = \tau(y); \tag{546}$$

$$\varphi^{(2)}(y) = -\frac{y\,\zeta''(\tau)}{1 + y\,\zeta''(\tau)}; \tag{547}$$

$$\varphi^{(3)}(y) = -\frac{y\,\zeta'''(\tau)}{[1 + y\,\zeta''(\tau)]^3}. \tag{548}$$

These results provide potentially a complete model for dark matter halo
biasing. The explicit dependence of the skewness and kurtosis parameters has
been computed in these hierarchical models in [57], see Fig. 43.

Although initially undertaken in the strongly nonlinear regime, these results
a priori extend to weakly nonlinear scales; that is, to scales where halo
separations are in the weakly nonlinear regime. Indeed only the tree structure, in
a quite general sense (see [48,57] for details), is required to get these results.
In this case the vertex $\nu_2(x)$ might bear a non-trivial angular dependence
originating from the expression of $\varphi^{(2)}(y)$, see Fig. 44. There is therefore a
priori no reason to recover the result in Eq. (528) for the halo bispectrum.
The connection, if any, with simple relations such as Eq. (524) is thus still to
be understood. Stochasticity emerging due to nonlinear effects is in particular
likely to limit the validity of Eq. (524).


7.1.3 Halo Clustering in the Extended Press-Schechter Approach

The results obtained in the previous subsection correspond to the correlation
properties of dense halos detected in a snapshot of the nonlinear density field.
This approach does not give any insight into the merging history of the halos,
which is likely to be important for the galaxy properties. Moreover, because dark
matter halos are highly non-linear objects, their formation and evolution have
traditionally been studied using numerical simulations.

However, a number of analytical models [460,459,119,589], based on the
so-called Press-Schechter (PS) formalism [531] and extensions [78,95,388,370],
have been shown to provide a good description of the numerical simulation results.

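The PS formalism aims at giving the comoving number density of halos as a
function of their mass $m$,

$$\frac{m^2\,n(m)}{\bar\rho} = \sqrt{\frac{2}{\pi}}\ \nu\,\exp\left(-\frac{\nu^2}{2}\right)\frac{d\ln\nu}{d\ln m}, \tag{549}$$

where $\bar\rho$ denotes the average density of the universe, and $\nu \equiv \delta_c/\sigma(m)$, with
$\delta_c \approx 1.68$ the collapse threshold given by the spherical collapse model, and
$\sigma^2(m)$ is the variance of the linearly extrapolated density field smoothed at
scale $R = (3m/4\pi\bar\rho)^{1/3}$. The average number of halos in a spherical region of
comoving radius $R_0$ and over-density $\delta_0$ is

$$\mathcal{N}(m|\delta_0)\,dm = \frac{m_0}{m}\,f(\sigma,\delta_c|\sigma_0,\delta_0)\,\frac{d\sigma^2}{dm}\,dm, \tag{550}$$

with $m_0$ the mass contained in the region, where

$$f(\sigma,\delta_c|\sigma_0,\delta_0) = \frac{1}{\sqrt{2\pi}}\,\frac{\delta_c-\delta_0}{(\sigma^2-\sigma_0^2)^{3/2}}\,\exp\left[-\frac{(\delta_c-\delta_0)^2}{2(\sigma^2-\sigma_0^2)}\right] \tag{551}$$

is the fraction of the mass in a region of initial radius $R_0$ and linear
over-density $\delta_0$ that is at present in halos of mass $m$ [78,95]. The Lagrangian halo
density contrast is then [460]

$$\delta_h^L(m|\delta_0) = \frac{\mathcal{N}(m|\delta_0)}{n(m)\,V_0} - 1, \tag{552}$$

where $V_0 = 4\pi R_0^3/3$. When $R_0 \gg R$, so that $\sigma_0 \ll \sigma$ and $|\delta_0| \ll \delta_c$, this gives

$$\delta_h^L(m|\delta_0) = \frac{\nu^2-1}{\delta_c}\,\delta_0. \tag{553}$$

On the other hand, the Eulerian halo density contrast is [460]

$$\delta_h(m|\delta_0) = \frac{\mathcal{N}(m|\delta_0)}{n(m)\,V} - 1, \tag{554}$$

where the volume $V = 4\pi R^3/3$ is related to the initial volume by $R_0 = R(1+\delta)^{1/3}$,
with $\delta(\delta_0) = \sum_{m=1}^{\infty} \nu_m\,\delta_0^m$ given by the spherical collapse model.
When considered as a function of $\delta$, Eq. (554) gives a bias relation similar to
Eq. (525) with bias parameters [459]

$$
\begin{aligned}
b_1(m) &= 1 + \epsilon_1, \qquad b_2(m) = 2(1-\nu_2)\,\epsilon_1 + \epsilon_2, \\
b_3(m) &= 6(\nu_3-\nu_2)\,\epsilon_1 + 3(1-2\nu_2)\,\epsilon_2 + \epsilon_3,
\end{aligned}
\tag{555}
$$

with

$$\epsilon_1 = \frac{\nu^2-1}{\delta_c}, \qquad \epsilon_2 = \frac{\nu^2}{\delta_c^2}\,(\nu^2-3), \qquad \epsilon_3 = \frac{\nu^2}{\delta_c^3}\,(\nu^4-6\nu^2+3). \tag{556}$$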
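
The coefficients in Eq. (556) are the Taylor coefficients of the ratio of the conditional to the unconditional PS mass fraction. The sketch below is an illustrative check rather than part of the source: it evaluates that ratio from Eq. (551) in the limit $\sigma_0 \ll \sigma$, where it reduces to $(1-t)\exp[\nu^2(t - t^2/2)]$ with $t = \delta_0/\delta_c$, and compares it with the series $1 + \epsilon_1\delta_0 + \epsilon_2\delta_0^2/2 + \epsilon_3\delta_0^3/6$. The function names and sample values of $\nu$ and $\delta_0$ are assumptions.

```python
import math

DELTA_C = 1.68  # collapse threshold quoted in the text


def lagrangian_bias(nu, delta_c=DELTA_C):
    """epsilon_1, epsilon_2, epsilon_3 of Eq. (556)."""
    e1 = (nu**2 - 1.0) / delta_c
    e2 = nu**2 * (nu**2 - 3.0) / delta_c**2
    e3 = nu**2 * (nu**4 - 6.0 * nu**2 + 3.0) / delta_c**3
    return e1, e2, e3


def one_plus_delta_h_lagrangian(nu, delta0, delta_c=DELTA_C):
    """1 + delta_h^L from Eqs. (550)-(552) in the limit sigma_0 << sigma."""
    t = delta0 / delta_c
    return (1.0 - t) * math.exp(nu**2 * (t - 0.5 * t * t))


nu, delta0 = 2.0, 0.01
e1, e2, e3 = lagrangian_bias(nu)
series = 1.0 + e1 * delta0 + e2 * delta0**2 / 2.0 + e3 * delta0**3 / 6.0
exact = one_plus_delta_h_lagrangian(nu, delta0)

# Rare-peak limit: epsilon_2 / epsilon_1^2 tends to 1 for nu >> 1.
r1, r2, _ = lagrangian_bias(10.0)
rare_peak_ratio = r2 / r1**2
```

For small $\delta_0$ the series and the closed form agree up to the neglected fourth-order term, and at large $\nu$ the ratio $\epsilon_2/\epsilon_1^2$ approaches unity.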


This framework has been extended to give halo biasing beyond the spherical
collapse approximation; in particular, [119] discuss the use of the Zel’dovich
approximation, the frozen-flow approximation and second-order Eulerian PT.
In addition, [593] study the effects of ellipsoidal collapse on both the mass
function and the biasing of dark matter halos. They show that tidal effects
change the threshold condition for collapse to become a function of mass,
$\delta_c(m)$, and that the resulting halo bias and mass function are in better
agreement with numerical simulations than the PS ones. In particular, less massive
halos are more strongly clustered than in PS calculations, as summarized by
fitting formulae derived from N-body simulations [350,527], and low (high)
mass halos are less (more) abundant than predicted in PS [590,343].

The higher-order moments for dark matter halos can be calculated from the
expansion in Eqs. (555) and (526), as first done in [459]. For instance, in the
rare peak limit $b_1 \approx \nu^2/\delta_c \gg 1$ and $b_2 \approx b_1^2$, so that the three-point function
obeys the hierarchical model with $Q_3 = 1$ (or equivalently $S_3 = 3$). This
actually extends to any order to give $Q_N = 1$, i.e. $S_p = p^{p-2}$ in this limit [459].

The fact that dark matter halos are spatially exclusive induces non-trivial
features in their correlation functions at small scales, which cannot be modeled
simply as a biasing factor acting on the mass correlation functions. In
particular, the variance becomes significantly less than the Poisson value at small
scales [460]. A detailed discussion of exclusion effects can be found in [589].


7.1.4 Galaxy Clustering 

Since galaxy formation cannot yet be described from first principles, a number
of prescriptions based on reasonable recipes for approximating the complicated
physics have been proposed for incorporating galaxy formation into
numerical simulations of dark matter gravitational clustering [371,598,134]. These
“semi-analytic galaxy formation” schemes can provide detailed predictions for
galaxy properties in hierarchical structure formation models, which can then
be compared with observations.

The basic assumption in the semi-analytic approach is that the distribution of
galaxies within halos can be described by a number of simplifying assumptions
regarding gas cooling and feedback effects from supernovae. For the purposes
of large-scale structure predictions, the main outcome of this procedure is the
number of galaxies that populate a halo of a given mass, $N_{\rm gal}(m)$. Typically,
at large mass $\langle N_{\rm gal}(m) \rangle \sim m^{\alpha}$ with $\alpha < 1$, and below some cutoff mass
$N_{\rm gal}(m) = 0$. The physical basis for this behavior is that for large masses the
gas cooling time becomes larger than the Hubble time, so galaxy formation is
suppressed in large-mass halos (therefore $\langle N_{\rm gal}(m) \rangle$ increases less rapidly than
the mass). On the other hand, in small-mass halos effects such as supernova
winds can blow away the gas from halos, also suppressing galaxy formation.

A useful analytical model has been recently developed, generally known as
“the halo model”, which can be easily modified to provide a description of
galaxy clustering using knowledge of the $N_{\rm gal}(m)$ relation and the clustering of
dark matter halos described in Sect. 7.1.3. The starting point is a description
of the dark matter distribution in terms of halos with masses, profiles and
correlations consistent with those obtained in numerical simulations. This is
a particular realization of the formalism first worked out in [552] for a general
distribution of seed masses, although precursors which did not include
halo-halo correlations were studied long before [477,502,446].

Let $u_m(r)$ be the profile of dark matter halos of mass $m$ (for example, as
given in [475,463]), normalized so that $\int d^3x'\,u_m({\bf x}-{\bf x}') = 1$, and $n(m)$ be the
mass function, with $\int n(m)\,m\,dm = \bar\rho$ and $\bar\rho$ the mean background density. The
power spectrum in this model is written as [587,495,579,419,158,570]

$$
\begin{aligned}
\bar\rho^2 P(k) = {} & (2\pi)^3 \int n(m)\,m^2\,dm\,|u_m(k)|^2 + (2\pi)^6 \int u_{m_1}(k)\,n(m_1)\,m_1\,dm_1 \\
& \times \int u_{m_2}(k)\,n(m_2)\,m_2\,dm_2\,P(k; m_1, m_2),
\end{aligned}
\tag{557}
$$

where $P(k; m_1, m_2)$ represents the power spectrum of halos of mass $m_1$ and
$m_2$. The first term denotes the power spectrum coming from pairs inside the
same halo (“1-halo” term), whereas the second contribution comes from pairs
in different halos (“2-halo” term). Similarly, the bispectrum is given by

$$
\begin{aligned}
\bar\rho^3 B_{123} = {} & (2\pi)^3 \int n(m)\,m^3\,dm \prod_{i=1}^{3} u_m(k_i) + (2\pi)^6 \int u_{m_1}(k_1)\,n(m_1)\,m_1\,dm_1 \\
& \times \int u_{m_2}(k_2)\,u_{m_2}(k_3)\,n(m_2)\,m_2^2\,dm_2\,P(k_1; m_1, m_2) + {\rm cyc.} \\
& + (2\pi)^9 \left( \prod_{i=1}^{3} \int u_{m_i}(k_i)\,n(m_i)\,m_i\,dm_i \right) B_{123}(m_1, m_2, m_3),
\end{aligned}
\tag{558}
$$

where $B_{123}(m_1, m_2, m_3)$ denotes the bispectrum of halos of mass $m_1$, $m_2$, $m_3$.
Again, contributions in Eq. (558) can be classified according to the spatial
location of triplets, from “1-halo” (first term) to “3-halo” (last term). The
halo-halo correlations, encoded in $P(k; m_1, m_2)$, $B_{123}(m_1, m_2, m_3)$ and so on,
are described by non-linear PT plus the halo-biasing prescription discussed
in Sect. 7.1.3, Eq. (555), plus Eqs. (526-528) with mass correlation functions
obtained from perturbation theory.

To describe galaxy clustering, one needs to specify the distribution (mean
and higher-order moments) of the number of galaxies which can inhabit
a halo of mass $m$. This is an output of the semi-analytic galaxy formation
schemes, e.g. [371,36], or some parameterization can be implemented
(see e.g. [570,39,37]) which is used to fit the clustering statistics. Assuming
that galaxies follow the dark matter profile, the galaxy power spectrum
reads [579,570]

$$
\begin{aligned}
\bar n_g^2\,P_g(k) = {} & (2\pi)^3 \int n(m)\,\langle N_{\rm gal}^2(m) \rangle\,dm\,|u_m(k)|^2 \\
& + (2\pi)^6 \left[ \int u_m(k)\,n(m)\,dm\,b_1(m)\,\langle N_{\rm gal}(m) \rangle \right]^2 P_L(k),
\end{aligned}
\tag{559}
$$

and similarly for the bispectrum, where the mean number density of galaxies
is

$$\bar n_g = \int n(m)\,\langle N_{\rm gal}(m) \rangle\,dm. \tag{560}$$

Thus, knowledge of the moments of the number of galaxies per halo, $\langle N_{\rm gal}^n(m) \rangle$, as
a function of halo mass gives a complete description of the galaxy clustering
statistics within this framework. Note that in the large-scale limit, the galaxy
bias parameters reduce to [$u_m(k) \to 1$]

$$b_i = \frac{1}{\bar n_g} \int n(m)\,dm\,b_i(m)\,\langle N_{\rm gal}(m) \rangle. \tag{561}$$

Therefore, in this prescription the large-scale bias parameters are not
independent: the whole hierarchy of $b_i$'s is a result of Eqs. (555) for $b_i(m)$ and
the $\langle N_{\rm gal}(m) \rangle$ relation, which can be described by only a few parameters.
In addition, the higher-order moments $\langle N_{\rm gal}^n(m) \rangle$ with $n > 1$ determine the
small-scale behavior of galaxy correlations; however, relations can be obtained
between these moments and the mean which, if robust to details$^{90}$, means that
the parametrization of the mean relation is the main ingredient of galaxy
biasing. In this sense, this framework promises to be a very powerful way of
constraining galaxy biasing.
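
Equations (560) and (561) are simple mass-weighted integrals, and a toy evaluation shows how the occupation function sets the large-scale galaxy bias. The following is a minimal sketch under stated assumptions: it uses the PS mass function of Eq. (549) with a hypothetical power-law $\nu(m)$ (the values of `M_STAR` and `N_EFF` are invented for illustration), the linear halo bias $b_1(m) = 1 + (\nu^2-1)/\delta_c$ from Eqs. (555)-(556), the $\langle N_{\rm gal} \rangle$ parameters quoted in the caption of Fig. 45, and $\bar\rho = 1$ in arbitrary units.

```python
import math

DELTA_C = 1.68    # collapse threshold quoted in the text
M_STAR = 1.0e13   # hypothetical mass (M_sun/h) at which nu = 1
N_EFF = -1.5      # hypothetical effective spectral index


def nu_of_mass(m):
    """Toy power-law nu(m) = delta_c/sigma(m); an assumption, not a fit."""
    return (m / M_STAR) ** ((N_EFF + 3.0) / 6.0)


def mass_function(m, rho_bar=1.0):
    """Press-Schechter n(m), Eq. (549), for the toy nu(m)."""
    nu = nu_of_mass(m)
    dlnnu_dlnm = (N_EFF + 3.0) / 6.0
    return (rho_bar / m**2) * math.sqrt(2.0 / math.pi) * nu \
        * math.exp(-0.5 * nu**2) * dlnnu_dlnm


def halo_bias(m):
    """Linear halo bias b_1(m) = 1 + (nu^2 - 1)/delta_c."""
    return 1.0 + (nu_of_mass(m) ** 2 - 1.0) / DELTA_C


def mean_ngal(m, m0=8.0e11, mc=4.0e9):
    """<N_gal(m)> with the parameters of the Fig. 45 caption."""
    if m < mc:
        return 0.0
    if m < m0:
        return m / m0
    return (m / m0) ** 0.8


def integrate_log(func, m_min, m_max, steps=4000):
    """Trapezoidal rule for the integral of func(m) dm on a log-spaced grid."""
    dln = math.log(m_max / m_min) / steps
    total = 0.0
    for i in range(steps + 1):
        m = m_min * math.exp(i * dln)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(m) * m * dln
    return total


M_MIN, M_MAX = 1.0e9, 1.0e16
n_gal = integrate_log(lambda m: mass_function(m) * mean_ngal(m), M_MIN, M_MAX)   # Eq. (560)
b_gal = integrate_log(lambda m: mass_function(m) * halo_bias(m) * mean_ngal(m),
                      M_MIN, M_MAX) / n_gal                                      # Eq. (561), i = 1
```

Since Eq. (561) is a weighted average of $b_1(m)$ with positive weights $n(m)\langle N_{\rm gal}(m) \rangle$, changing the slope or the cutoff of the occupation function shifts `b_gal` by moving weight between low-bias and high-bias halos.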

90 The simplest of such relations assumes Poisson statistics, where
$\langle N_{\rm gal}(N_{\rm gal}-1)\cdots(N_{\rm gal}-j) \rangle = \langle N_{\rm gal} \rangle^{j+1}$, but it is known to fail for low-mass halos, which
have sub-Poisson dispersions [371,36]. A simple fix assumes a binomial
distribution [570], with two free parameters that reproduce the mean and second moment,
and automatically predict the $n > 2$ moments. However, it is not known yet how
well this model predicts the $n > 2$ moments. Other prescriptions are given
in [36,39,37]; in particular, [39] study in detail the sensitivity of galaxy clustering
to the underlying distribution.

Fig. 45. The $S_p$ parameters for $p = 3, 4, 5$ (from bottom to top) for dark
matter (solid) and galaxies (dot-dashed) as a function of smoothing scale $R$. These
predictions correspond to those of the halo model; for galaxies they assume that
$\langle N_{\rm gal} \rangle = (m/m_0)^{0.8}$ for $m > m_0 = 8 \times 10^{11}\,M_\odot h^{-1}$, $\langle N_{\rm gal} \rangle = (m/m_0)$ for
$m_c < m < m_0$ and $\langle N_{\rm gal} \rangle = 0$ for $m < m_c = 4 \times 10^{9}\,M_\odot h^{-1}$.

The weighting introduced by the $\langle N_{\rm gal}^n(m) \rangle$ on clustering statistics has many
desirable properties. In particular, the suppression of galaxy formation in
high-mass halos leads to a galaxy power spectrum that displays power-law-like
behavior$^{91}$ [36,579,495,570], and higher-order correlations show smaller
amplitudes at small scales than their dark matter counterparts [570] (see Fig. 45),
as observed in galaxy catalogs. A very important additional consideration is
that this high-mass suppression also leads to a velocity dispersion of galaxies in
agreement with galaxy surveys such as LCRS [349].

7.2 Projection Effects

This subsection is devoted to the particular case of angular surveys. These
surveys constitute a large part of the available data and allow one to probe the
statistical properties of the cosmic density field at large scales, as we shall
discuss in the next chapter; furthermore, they do not suffer from
redshift-space distortions. Although they do not really probe new aspects of
gravitational dynamics, the filtering scheme deserves a specific treatment. It is also
worth noting here, as we shall briefly discuss in the next section, that this
filtering directly applies to weak lensing observations that are now emerging,
see e.g. [453] for a review.

91 In addition, note that a power-law behavior has also been obtained in numerical
simulations by selecting ‘galaxies’ as halos of specific circular velocities [141].

In the following we first review the general aspects of projection effects, and
quickly turn to the widely used small-angle approximation, where most
applications have been done. We then show how the three-dimensional (3D)
hierarchical model projects into a two-dimensional (2D) hierarchy, where the
3D and 2D hierarchical coefficients are simply related. In Sects. 7.2.4 and 7.2.5
we go beyond the hierarchical assumption to present predictions for the
projected density in PT. Finally, in Sect. 7.2.6, we discuss the reconstruction of
the one-point PDF of the projected density.


7.2.1 The Projected Density Contrast 

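Let us describe the comoving position $\mathbf{x}$ in terms of the radial distance $\chi$ and
the angular distance $\mathcal{D}$, so that $\mathbf{x} = (\chi, \mathcal{D}\boldsymbol{\theta})$^{92}. The radial distance is
defined by^{93}

$$d\chi = \frac{c\, dz / H_0}{\sqrt{\Omega_\Lambda + (1 - \Omega_m - \Omega_\Lambda)(1+z)^2 + \Omega_m (1+z)^3}} \qquad (562)$$

with $H_0$ Hubble's constant^{94} and $c$ the speed of light, while the angular
distance is defined by

$$\mathcal{D}(\chi) = \frac{c/H_0}{\sqrt{1 - \Omega_m - \Omega_\Lambda}}\, \sinh\left( \sqrt{1 - \Omega_m - \Omega_\Lambda}\; \frac{H_0\, \chi}{c} \right). \qquad (563)$$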
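As a rough numerical illustration of Eqs. (562) and (563), the following sketch
integrates the radial distance with Simpson's rule and converts it to the
angular distance. Distances are expressed in units of c/H_0. The function names
are hypothetical, and the sine branch for closed models (the analytic
continuation of the sinh form to negative curvature parameter) is an addition
made here for completeness, not something spelled out in the text.

```python
import math

def comoving_distance(z, omega_m, omega_l, n_steps=2000):
    """Radial comoving distance chi(z) in units of c/H0 (Simpson's rule on Eq. 562)."""
    omega_k = 1.0 - omega_m - omega_l

    def integrand(zp):
        # 1 / sqrt(Omega_L + Omega_k (1+z)^2 + Omega_m (1+z)^3)
        return 1.0 / math.sqrt(omega_l + omega_k * (1 + zp) ** 2
                               + omega_m * (1 + zp) ** 3)

    if z == 0:
        return 0.0
    n = n_steps + (n_steps % 2)          # Simpson's rule needs an even count
    h = z / n
    total = integrand(0.0) + integrand(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def angular_distance(chi, omega_m, omega_l):
    """Angular distance D(chi) in units of c/H0 (Eq. 563 and its continuations)."""
    omega_k = 1.0 - omega_m - omega_l
    if abs(omega_k) < 1e-12:             # flat limit: D = chi
        return chi
    s = math.sqrt(abs(omega_k))
    if omega_k > 0:                      # open model: sinh form of Eq. (563)
        return math.sinh(s * chi) / s
    return math.sin(s * chi) / s         # closed model (assumed continuation)

# Example: distance to z = 1 in an Einstein-de Sitter and in an open model
chi_eds = comoving_distance(1.0, 1.0, 0.0)
chi_open = comoving_distance(1.0, 0.3, 0.0)
d_open = angular_distance(chi_open, 0.3, 0.0)
```

In the Einstein-de Sitter case the integral has the closed form
2[1 - (1+z)^{-1/2}], which gives a convenient check of the quadrature.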

In general, for angular surveys, the measured density contrast of galaxy counts
at angular direction $\boldsymbol{\theta}$ is related to the 3D density contrast through

$$\delta_{2D}(\boldsymbol{\theta}) = \int d\chi\, \chi^2\, \psi(\chi)\, \delta_{3D}(\chi, \mathcal{D}\boldsymbol{\theta}), \qquad (564)$$

where $\psi(\chi)$ is the selection function (normalized such that $\int d\chi\, \chi^2 \psi(\chi) = 1$);
it is the normalized probability that a point (galaxy) at a distance $\chi$ is included
in the catalog.

In practice the depth of the projection is finite due to the rapid decrease of the
selection function $\psi(\chi)$ with $\chi$ at finite distance. The selection function $\psi(\chi)$
for a sample limited by apparent magnitudes between $m_1$ and $m_2$ is typically
given by

$$\psi(\chi) = \psi^* \int_{q_1(\chi)}^{q_2(\chi)} dq\, \phi^*\, q^{\alpha}\, e^{-q}, \qquad q_i(\chi) = 10^{-\frac{2}{5}\left(M_i(\chi) - M^*\right)}, \quad i = 1,2, \qquad (565)$$

with

$$M_i(\mathcal{D}) = m_i - 5 \log_{10}\left[\mathcal{D}(1+z)\right] - 25, \qquad (566)$$

where $\psi^*$ is a normalization constant and $\phi(q) = \phi^* q^{\alpha} e^{-q}$ is the luminosity
function, i.e. the number density of galaxies of a given luminosity. $M^*$ and
$\alpha$ might be expressed as functions of redshift $z$ to account for k-corrections
and luminosity evolution. When redshift information is available, one can also
rewrite the selection function in terms of the sample redshift number counts
$N(z)$ alone.

92 See cosmology textbooks, e.g. [511], or the pedagogical summary in [316] for a
detailed presentation of these aspects.

93 Note that the $\Omega$ parameters refer here to those evaluated at redshift z = 0.

94 Throughout this work we use $H_0$ = 100 h km/sec/Mpc.


7.2.2 The Small-Angle Approximation 

The cumulants of the projected density can obviously be related to those of
the 3D density fields. Formally they correspond to the ones of the 3D field
filtered by a conical-shaped window. From Eq. (564) we obtain:

$$w_N(\theta_1, \dots, \theta_N) = \int \prod_{i=1}^{N} d\chi_i\, \chi_i^2\, \psi(\chi_i)\, \langle \delta(\chi_1, \mathcal{D}_1\theta_1) \cdots \delta(\chi_N, \mathcal{D}_N\theta_N) \rangle_c. \qquad (567)$$


The computation of such quantities can be easily carried out in the small-angle
approximation. This approximation is valid when the transverse distances
$\mathcal{D}|\theta_i - \theta_j|$ are much smaller than the radial distances $\chi_i$. In this case the integral
(567) is dominated by configurations where $\chi_i - \chi_j \sim \mathcal{D}_i|\theta_i - \theta_j| \sim \mathcal{D}_j|\theta_i - \theta_j|$.

This allows one to make the change of variables $\chi_i \to r_i$ with $\chi_i = \chi_1 + r_i \mathcal{D}_1 (\theta_i - \theta_1)$.
Then, since the correlation length (beyond which the multi-point correlation
functions are negligible) is much smaller than the Hubble scale $c/H(z)$ (where
$H(z)$ is the Hubble constant at redshift $z$), the integral over $r_i$ converges over
a small distance of the order of $\mathcal{D}_1|\theta_i - \theta_1|$ and the expression (567) can be
simplified to read

$$w_N(\theta_1, \dots, \theta_N) = \int d\chi_1\, \chi_1^{2N}\, \psi(\chi_1)^N \int \prod_{i=2}^{N} \mathcal{D}_1 (\theta_i - \theta_1)\, dr_i\; \xi_N[(\chi_1, \mathcal{D}_1\theta_1), \dots, (\chi_N, \mathcal{D}_1\theta_N)]. \qquad (568)$$


This equation constitutes the small-angle approximation for the correlation
functions. If these behave as power laws, Eq. (568) can be further simplified.
For instance, the two-point function is then given by the Limber equation [402],

$$w_2(\theta) = \theta^{1-\gamma}\, r_0^{\gamma} \int d\chi\, \chi^4\, \mathcal{D}^{1-\gamma}\, \psi^2(\chi) \int_{-\infty}^{\infty} dr\, (1 + r^2)^{-\gamma/2}, \qquad (569)$$

if the 3D correlation function is $\xi_2(r) = (r/r_0)^{-\gamma}$. The fact that the last
integral that appears in this expression converges^{95} justifies the use of the
small-angle approximation. It means that the projected correlation functions
are dominated by intrinsic 3D structures, that is, the major contributions come
from finite values of $r_i$, which correspond to points that are close together in
3D space.
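The convergence of the radial integral in the Limber equation can be checked
numerically. The sketch below compares a midpoint-rule estimate of the integral
of (1 + r^2)^{-gamma/2} over the whole real line with the closed form
sqrt(pi) Gamma[(gamma-1)/2] / Gamma[gamma/2]. The substitution r = tan(t) and
the function names are choices made here for illustration only.

```python
import math

def limber_integral_closed(gamma):
    """Closed form of the radial integral, valid for gamma > 1."""
    return (math.sqrt(math.pi) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))

def limber_integral_numeric(gamma, n_steps=100000):
    """Midpoint rule after substituting r = tan(t), t in (-pi/2, pi/2).

    With this substitution, (1 + r^2)^(-gamma/2) dr = cos(t)^(gamma - 2) dt,
    so the infinite range becomes finite.
    """
    h = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        t = -math.pi / 2.0 + (i + 0.5) * h
        total += math.cos(t) ** (gamma - 2.0)
    return total * h

# A slope close to the observed galaxy correlation function
numeric = limber_integral_numeric(1.8)
closed = limber_integral_closed(1.8)
```

As gamma approaches 1 from above, Gamma[(gamma-1)/2] diverges, which reflects
the statement that the integral only converges for gamma > 1.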

The small-angle approximation seems to be an excellent approximation both
for $w_2$ and for $w_3$ up to $\theta \simeq 2$ deg. This can be easily checked by numerical
integration of a given model for $\xi_2$ and $\xi_3$, see e.g. [508,48,254].

An equivalent way of looking at the small-angle approximation is to write the
corresponding relations in Fourier space. The angular two-point correlation
function can be written in terms of the 3D power spectrum as [364],

$$w_2(\theta) = 2\pi \int d\chi\, \chi^4\, \psi^2(\chi) \int d^2k_\perp\, P(k_\perp)\, e^{i \mathcal{D}\, \mathbf{k}_\perp \cdot \boldsymbol{\theta}}. \qquad (570)$$

The expression (570) shows that in Fourier space the small-angle approximation
consists in neglecting the radial component of $\mathbf{k}$ (of the order of the
inverse of the depth of the survey) compared to $k_\perp$ (of the order of the inverse
of the transverse size of the survey). Thus, in the small-angle approximation,
the power spectrum of the projected density field is

$$P_{2D}(l) = 2\pi \int d\chi\, \frac{\chi^4\, \psi^2(\chi)}{\mathcal{D}^2}\, P\!\left(\frac{l}{\mathcal{D}}\right). \qquad (571)$$


This can be easily generalized to higher-order correlations in Fourier space,

$$\langle \delta_{2D}(\mathbf{l}_1) \cdots \delta_{2D}(\mathbf{l}_N) \rangle_c = (2\pi)^{N-1}\, \delta_D(\mathbf{l}_1 + \dots + \mathbf{l}_N) \int d\chi\, \frac{\chi^{2N}\, \psi^N(\chi)}{\mathcal{D}^{2N-2}}\, P_N\!\left(\frac{\mathbf{l}_1}{\mathcal{D}}, \dots, \frac{\mathbf{l}_N}{\mathcal{D}}\right). \qquad (572)$$

Note that the Fourier-space expression given above assumes in fact not only
the small-angle approximation, but also the flat-sky approximation, which
neglects the curvature of the celestial sphere. General expressions for the power
spectrum and higher-order correlations beyond the small-angle (and flat-sky)
approximation can be derived from Eq. (567) by Legendre transforms, see
e.g. [242,671].
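A minimal numerical sketch of the small-angle projection of the power spectrum
is given below. It assumes the form P_2D(l) = 2 pi \int d\chi \chi^4 \psi^2 / D^2
P(l/D), a low-redshift Euclidean geometry (D = \chi), and an illustrative
bell-shaped selection function with depth set to one. All function names and
default parameters are hypothetical. For a power-law P(k) ~ k^n the projected
spectrum keeps the same slope, which the example verifies.

```python
import math

def bell_selection(chi, depth=1.0, b=0.1):
    """Unnormalized bell-shaped selection function chi^-b exp(-chi^2/depth^2)."""
    return chi ** (-b) * math.exp(-(chi / depth) ** 2)

def projected_power(ell, power_3d, chi_max=8.0, n_steps=4000, depth=1.0, b=0.1):
    """Small-angle projected power spectrum, assuming D(chi) = chi."""
    h = chi_max / n_steps
    grid = [(i + 0.5) * h for i in range(n_steps)]   # midpoints avoid chi = 0
    # Normalize psi so that the integral of chi^2 psi(chi) equals one
    norm = sum(chi ** 2 * bell_selection(chi, depth, b) for chi in grid) * h
    total = 0.0
    for chi in grid:
        psi = bell_selection(chi, depth, b) / norm
        # chi^4 psi^2 / D^2 reduces to chi^2 psi^2 when D = chi
        total += chi ** 2 * psi ** 2 * power_3d(ell / chi)
    return 2.0 * math.pi * total * h

n = -1.2
power_law = lambda k: k ** n
ratio = projected_power(20.0, power_law) / projected_power(10.0, power_law)
```

Doubling l changes the projected power by the factor 2^n, so the projection only
rescales the amplitude of a pure power law.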


7.2.3 Projection in the Hierarchical Model 

95 It is given by $\int_{-\infty}^{\infty} dr\, (1 + r^2)^{-\gamma/2} = \sqrt{\pi}\, \Gamma\!\left(\frac{\gamma - 1}{2}\right) / \Gamma\!\left(\frac{\gamma}{2}\right)$, which converges for $\gamma > 1$.

The inversion of Eq. (568), to express $\xi_N$ in terms of $w_N$, is still not trivial in
general because the projection effects mix different scales. As in the case of
the two-point correlation function, i.e. Limber's equation, it is much easier
to obtain a simple relation between 3D and 2D statistics for models of $\xi_N$
that have simple scale dependence. In the hierarchical model introduced in
Sect. 4.5.5,

$$\xi_N(\mathbf{r}_1, \dots, \mathbf{r}_N) = \sum_{\alpha=1}^{t_N} Q_{N,\alpha} \sum_{\rm labelings}\, \prod_{\rm edges}^{N-1} \xi_2(r_A, r_B), \qquad (573)$$


and, remarkably, it follows that the projected angular correlations obey a
similar relation:

$$w_N(\theta_1, \dots, \theta_N) = \sum_{\alpha=1}^{t_N} q_{N,\alpha} \sum_{\rm labelings}\, \prod_{\rm edges}^{N-1} w_2(\theta_A, \theta_B), \qquad (574)$$


where $q_{N,\alpha}$ is simply proportional to $Q_{N,\alpha}$. Moreover, the relation between $q_{N,\alpha}$
and $Q_{N,\alpha}$ depends only on the order $N$ and is independent of the particular
tree topology. To express $q_N$ in terms of $Q_N$ we can use a power-law model for
the two-point correlation, $\xi_2(r) = (r/r_0)^{-\gamma}$. For small angles, we thus have

$$q_N = r_N\, Q_N, \qquad r_N = \frac{I_1^{N-2}\, I_N}{I_2^{N-1}}, \quad {\rm with} \quad I_k = \int_0^{\infty} d\chi\, \chi^{2k}\, \mathcal{D}^{k-1}\, \psi^k(\chi)\, \mathcal{D}^{-\gamma(k-1)}\, (1+z)^{-3(k-1)}, \qquad (575)$$

where we have taken into account the redshift evolution of the two-point
correlation function in the non-linear regime assuming stable clustering (see
Sect. 4.5.2), $\xi_2(r, z) = \xi_2(r)\, (1+z)^{-3}$. The integrals $I_k$ are just numerical
values that depend on the selection function and $\gamma$. The values of $\psi^*$ and $\phi^*$
in Eq. (565) are thus irrelevant for $q_N$. The only relevant parameters in the
luminosity function are $M^*$ and $\alpha$.

The resulting values of $r_N$ increase with $\gamma$ and $M^*$ and decrease with $\alpha$, but
do not change much within the uncertainties in the shape of the luminosity
function (see §56 in [508], and [249]). This is illustrated in Table 13, where
values of $r_N$ are listed for different parameters in the selection function. In
the analysis of the APM, variations of $\gamma$ are only important for very large
scales, $\theta > 3°$, where $\gamma$ changes from 1.8 to 3. In this case $r_N$ displays a
considerable variation and Eq. (575) is not a good approximation.

As an example we can consider the selection function given by the characteristic
"bell shape" in a magnitude-limited sample:

$$\psi(r) \propto r^{-b} \exp\left[-r^2/\mathcal{D}^2\right], \qquad (576)$$


where $\mathcal{D}$ is related to the effective sample depth and $b$ is typically a small
number (e.g. for the APM $b \simeq 0.1$ and $\mathcal{D} \simeq 350\, h^{-1}$ Mpc). For this selection
function and a power-law $P(k) \propto k^n$ (i.e. $\gamma = n + 3$) we can calculate $r_3$
explicitly,

$$r_3 = \frac{8}{3\sqrt{3}} \left(\frac{3\sqrt{3}}{4}\right)^{b} \frac{\Gamma[3/2 - b/2]\; \Gamma[3/2 - n - 3b/2]}{\Gamma[3/2 - n/2 - b]^2} \left(\frac{3}{2}\right)^{n}. \qquad (577)$$

For $b = 0$ and $n = 0$ we find $r_3 = 8/(3\sqrt{3}) \simeq 1.54$, while for $b = 0$ and $n = -1$,
closer to the APM case, $r_3 = 2\pi/(3\sqrt{3}) \simeq 1.21$, comparable to the values given
in [249].

Table 13
Projection factors $r_N$ for different slopes $\gamma$ and parameters $M_0^*$ and $\alpha_0$ in the
luminosity function.

  gamma   M_0^*   alpha_0   r_3    r_4    r_5    r_6    r_7    r_8    r_9
  1.7     -19.8   -1.0      1.19   1.52   2.00   2.71   3.72   5.17   7.25
  1.7     -19.3   -1.2      1.21   1.57   2.12   2.93   4.13   5.88   8.44
  1.7     -20.3   -0.8      1.18   1.48   1.93   2.56   3.46   4.73   6.51
  1.8     -19.8   -1.0      1.20   1.55   2.08   2.85   3.98   5.62   8.00
  3.0     -19.8   -1.0      1.54   2.85   5.78   12.4   27.8   63.9   150
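The closed-form projection factor for the bell-shaped selection function can be
evaluated and cross-checked numerically. The sketch below assumes the closed
form r_3 = 8/(3 sqrt 3) (3 sqrt 3 / 4)^b Gamma[3/2 - b/2] Gamma[3/2 - n - 3b/2]
/ Gamma[3/2 - n/2 - b]^2 (3/2)^n, and compares it with a direct midpoint-rule
evaluation of r_3 = I_1 I_3 / I_2^2. The numerical part assumes a Euclidean
geometry (angular distance equal to radial distance), no redshift evolution,
gamma = n + 3, and a depth parameter set to one (the ratio does not depend on
it). Function names are hypothetical.

```python
import math

def r3_closed_form(n, b):
    """Projection factor r_3 for psi ~ r^-b exp(-r^2/D^2) and P(k) ~ k^n."""
    prefactor = 8.0 / (3.0 * math.sqrt(3.0))
    prefactor *= (3.0 * math.sqrt(3.0) / 4.0) ** b * 1.5 ** n
    gammas = (math.gamma(1.5 - b / 2.0) * math.gamma(1.5 - n - 1.5 * b)
              / math.gamma(1.5 - n / 2.0 - b) ** 2)
    return prefactor * gammas

def radial_moment(k, n, b, chi_max=10.0, n_steps=20000):
    """I_k = int dchi chi^(2k) chi^(-(n+2)(k-1)) psi^k, with depth = 1."""
    power = 2 * k - (n + 2) * (k - 1) - k * b
    h = chi_max / n_steps
    total = 0.0
    for i in range(n_steps):
        chi = (i + 0.5) * h              # midpoint rule
        total += chi ** power * math.exp(-k * chi ** 2)
    return total * h

def r3_numerical(n, b):
    """r_3 = I_1 I_3 / I_2^2 from direct quadrature."""
    return (radial_moment(1, n, b) * radial_moment(3, n, b)
            / radial_moment(2, n, b) ** 2)

r3_white_noise = r3_closed_form(0.0, 0.0)    # b = 0, n = 0
r3_apm_like = r3_closed_form(-1.0, 0.0)      # b = 0, n = -1
```

The agreement between the two routes for a nonzero b (for instance b = 0.1)
is a useful sanity check on the exponents of the Gamma functions.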

It is important to notice that although the $r_N$ are unaffected by changes in $\psi^*$,
the overall normalization of $I_k$ can change significantly. Because of this, while
the amplitude of $\xi_2$ is uncertain by 40% for $\Delta M^* = 1.0$ and $\Delta\alpha = 0.4$, the
corresponding uncertainty in $r_3$ is only 2%. This is an excellent motivation for
using the hierarchical ratios $q_N$ as measures of clustering.

Note that the above hierarchical prediction could only provide a good
approximation to clustering observations at small scales, where the hierarchical model
in Eq. (573) might be a good approximation (see Sects. 4.5.5 and 8.2.4). On
larger scales, accurate predictions require projection using the PT hierarchy,
which is different from Eq. (573), as the N-point correlation functions have a
significant shape dependence (see Sect. 4.1). Despite this ambiguity on how to
compare angular observations to theoretical predictions, note that these two
approaches give results that agree within 20% (e.g. see Fig. 47 below).

7.2.4 The Correlation Hierarchy for the Projected Density 

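We can define the area-averaged angular correlations $\bar{\omega}_p(\theta)$ in terms of the
angular correlation functions $w_N(\theta_1, \dots, \theta_N)$:

$$\bar{\omega}_p(\theta) = \frac{1}{A^p} \int_A dA_1 \cdots dA_p\; w_p(\theta_1, \dots, \theta_p) = \langle \bar{\delta}_{2D}^{\,p}(\theta) \rangle_c, \qquad (578)$$

where $A = 2\pi(1 - \cos\theta)$ is the solid angle of the cone, $dA_p = \sin\theta_p\, d\theta_p\, d\varphi_p$ and
$\bar{\delta}_{2D}(\theta)$ is the density contrast inside the cone. Thus $\bar{\omega}_p(\theta)$ only depends on the
size of the cone, $\theta$, analogous to smoothed moments in the 3D case. The use
of Eq. (568) leads to

$$\bar{\omega}_p(\theta) = \frac{1}{A^p} \int_A dA_1 \cdots dA_p \int d\chi_1\, \chi_1^{2p}\, \psi(\chi_1)^p \int \prod_{i=2}^{p} \mathcal{D}_1 (\theta_i - \theta_1)\, dr_i\; \xi_p[(\chi_1, \mathcal{D}_1\theta_1), \dots, (\chi_p, \mathcal{D}_1\theta_p)]. \qquad (579)$$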


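One can see that the cumulants of the projected density are thus line-of-sight
averages of the density cumulants in a cylindrical window function,

$$\bar{\omega}_p(\theta) = \int d\chi\, \chi^{2p}\, \psi^p(\chi)\, \langle \delta^p_{\mathcal{D}\theta,{\rm cyl}} \rangle_c\, L^{p-1}, \qquad (580)$$

where $\delta_{\mathcal{D}\theta,{\rm cyl}}$ is the filtered 3D density with a cylindrical filter of transverse
size $\mathcal{D}\theta$ and depth $L$. For instance, written in terms of the power spectrum,
the second moment reads

$$\bar{\omega}_2(\theta) = 2\pi \int d\chi\, \chi^4\, \psi^2(\chi) \int d^2k_\perp\, P(k_\perp)\, W_{2D}^2(\mathcal{D}\theta\, k_\perp), \qquad (581)$$

where $W_{2D}$ is the top-hat 2D window function,

$$W_{2D}(l\theta) = \frac{2 J_1(l\theta)}{l\theta}. \qquad (582)$$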


The relation (580) shows that the cumulant hierarchy is preserved. If we define
the $s_p$ parameters in angular space,

$$s_p(\theta) \equiv \frac{\bar{\omega}_p(\theta)}{[\bar{\omega}_2(\theta)]^{p-1}}, \qquad (583)$$

it follows that they are all finite and independent of $L$.

In the weakly nonlinear regime, we can compute exactly the hierarchy for the
projected density because the density cumulants for a cylindrical window are
those obtained for the 2D dynamics (see Sect. 5.9). In the case of a power-law
spectrum the $s_p$ are independent of the filtering scale. The line-of-sight
integrations can then be performed explicitly^{96}. Using Eq. (580) and the results
of Sect. 5.9 gives^{97}

$$s_p = r_p\, S_p^{2D}, \qquad (584)$$

$$r_p = \frac{I_p\, I_1^{p-2}}{I_2^{p-1}}, \quad {\rm with} \quad I_k = \int_0^{\infty} d\chi\, \chi^{2k}\, \psi^k(\chi)\, \mathcal{D}^{-(n+3)(k-1)}\, D_1^{2k-2}(z). \qquad (585)$$

96 For CDM models a semianalytic result can be obtained for the skewness, see [524]
for details.

97 It is important to note that in Eq. (584) the coefficients $S_p^{2D}$ need to be used and
not those corresponding to 3D top-hat filtering as suggested by the tree hierarchical
model.

Note that the $r_p$ coefficients are very similar to those in the nonlinear case
except that the redshift evolution of the fluctuations is taken here to be given
by the linear growth factor $D_1(z)$. This is actually relevant only when the redshift
under consideration is comparable to unity.

An interesting point is that it may seem inconsistent to use both tree-level 
PT predictions and the small-angle approximation, as a priori it is not clear 
whether their regimes of validity overlap. As shown in [254] for characteristic 
depths comparable to APM there is at least a factor of five in scale where both 
approximations are consistent, depending on the 3D power spectrum shape. 
For deeper surveys, of course, the consistency range is increased, so this is a 
meaningful approach. 

As expected, similar results hold for the hierarchy of correlation functions in
the weakly non-linear regime. The results for the angular three-point function
and bispectrum have been studied in most detail [242,225,101,671]. From
Eqs. (571-572) and for power-law spectra, it follows that the configuration
dependence of the bispectrum is conserved by projection; only the amplitude
is changed by the projection factor $r_3$, as in Eq. (585) [275,242,225,101].
However, as soon as the spectral index changes significantly on scales comparable
to those sampled by the selection function, this simple result does not hold
anymore [242]. A number of additional results regarding the shape dependence
of projected correlations include: i) a study of the dependence on configuration
shape as a function of depth [81], which also includes redshift-dependent
galaxy biasing; ii) the power of angular surveys to determine bias parameters
from the projected bispectrum in spherical harmonics [671]; and iii) comparisons
of PT predictions and numerical simulations in angular space [225], as
we summarize in the next section.

7.2.5 Comparison with Numerical Simulations 

We now illustrate the results described in the previous section and compare 
their regime of validity against numerical simulations. 

Figure 46 shows the angular three-point correlation function for the APM-like and
SCDM spectra projected to the depth of the APM survey; see [225] for more
details. As discussed before, the configuration dependence of the three-point
amplitude is quite sensitive to the shape of the power spectrum. Both the shape
and amplitude of $q_3(\alpha)$ predicted by PT (solid curves) are reproduced by the
N-body results (points) even on these moderately small scales^{98}. The error
bars in the simulation results are estimated from the variance between 5 maps
from different N-body realizations and have been scaled to 1-$\sigma$ uncertainties
for a single observer. The dashed lines correspond to the results of the 3D
$Q_3$ for $r_{12} = r_{13} = 15\, h^{-1}$ Mpc multiplied by the hierarchical projection factor
in Eq. (575), i.e. $q_3 = Q_3\, r_3$. The model seems to work well for small $\alpha$, but
there are significant deviations for large $\alpha$, which illustrate that this projection
model does not work well, as discussed above.

98 At the mean depth of the APM, two degrees corresponds to $\simeq 15\, h^{-1}$ Mpc.

Fig. 46. Projected leading order PT predictions (solid curves) and N-body results
(points with sampling errors) for the angular 3-point amplitude $q_3(\alpha)$ at fixed
$\theta_{12} = \theta_{13} = 2$ deg for a survey with the APM selection function. N-body results
correspond to the average and variance of 5 realizations of the APM-like model
(top) and the SCDM model (bottom). The dashed lines show the corresponding PT
predictions for $r_{12} = r_{13} = 15$ Mpc/h projected with the hierarchical model.

In the weakly nonlinear regime the third moment of smoothed angular
fluctuations, defined in (579), can be explicitly written in terms of the power
spectrum using PT. It is given by

$$\bar{\omega}_3 = 6 (2\pi)^2 \int d\chi\, \chi^6\, \psi^3(\chi) \left[ \frac{6}{7} \left( \int k\, dk\, W_{2D}^2(k \mathcal{D}\theta)\, P(k) \right)^2 + \frac{1}{2} \int k\, dk\, W_{2D}^2(k \mathcal{D}\theta)\, P(k) \int k^2\, dk\, \mathcal{D}\theta\, W_{2D}(k \mathcal{D}\theta)\, W_{2D}'(k \mathcal{D}\theta)\, P(k) \right], \qquad (586)$$

where $W_{2D}'$ is the derivative of the top-hat window $W_{2D}$ defined in Eq. (582).
Therefore, in the case of a power-law spectrum $P(k) \sim k^n$, we have [48]

$$s_3 = r_3 \left[ \frac{36}{7} - \frac{3}{2}(n + 2) \right], \qquad (587)$$

with $r_3$ given in general by Eq. (585). The coefficient $r_3$ is found in practice
to be of order unity and to be very weakly dependent on the adopted shape
for the selection function.

Fig. 47. Tree-level PT predictions for the APM-like power spectrum (solid curves)
and corresponding N-body results (points with sampling errors) for the projected
smoothed skewness $s_3(\theta)$ as a function of the radius $\theta$ (in degrees) of the cells in
the sky. The short- and long-dashed lines show the hierarchical prediction $s_3 \simeq r_3 S_3$,
see text for details.

It is worth noting that the hierarchical model in Sect. 7.2.3 yields a different
prediction for $s_3$ than the above tree-level value. In the hierarchical case,
$s_3 \simeq r_3 S_3$ [249,250] with $S_3 = 34/7 - (n + 3)$. For example, for $n \simeq -1$,
the hierarchical model yields $s_3 \simeq 3.43$ while the tree-level prediction yields
$s_3 \simeq 4.38$. This difference becomes smaller as we move towards larger $n$ (e.g.
larger scales), being zero at $n = 4/7$, but it is significant for the range of
scales probed with current observations, even after taking uncertainties into
account.
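The comparison between the two predictions for the projected skewness can be
reproduced with a few lines of arithmetic. The sketch below assumes the
tree-level form s_3 = r_3 [36/7 - (3/2)(n + 2)] for 2D top-hat smoothing and the
hierarchical form s_3 = r_3 [34/7 - (n + 3)] built from the 3D skewness. The
value r_3 = 1.2 is an assumed projection factor of the size discussed for
APM-like samples, and the function names are hypothetical.

```python
def s3_tree_level(n, r3):
    """Projected skewness from tree-level PT with 2D (cylindrical) filtering."""
    return r3 * (36.0 / 7.0 - 1.5 * (n + 2.0))

def s3_hierarchical(n, r3):
    """Projected skewness from the hierarchical model, using the 3D S_3."""
    return r3 * (34.0 / 7.0 - (n + 3.0))

r3 = 1.2                                  # assumed projection factor
hier = s3_hierarchical(-1.0, r3)          # close to the value 3.43 quoted in the text
tree = s3_tree_level(-1.0, r3)            # within rounding of the value 4.38 quoted in the text

# The two predictions coincide where 34/7 - (n + 3) = 36/7 - (3/2)(n + 2)
n_cross = 4.0 / 7.0
gap_at_crossing = s3_tree_level(n_cross, r3) - s3_hierarchical(n_cross, r3)
```

The gap between the two predictions is r_3 (2/7 - n/2), which is positive for
the negative spectral indices relevant on the scales probed by angular catalogs.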

Figure 47 compares the predictions for the angular skewness $s_3$ by tree-level
PT (solid lines) for a power spectrum that matches the APM catalogue and
the APM measurements (triangles). These predictions correspond to a numerical
integration of PT predictions in Eq. (587) [254]. The dashed lines show
the "naive" hierarchical prediction $s_3 \simeq r_3 S_3$ at the angular scale $\theta \simeq R/\mathcal{D}$
given by the depth, $\mathcal{D}$, of the survey. The long-dashed line uses a fixed value
of $r_3 = 1.2$, while the short-dashed line corresponds to $r_3 = r_3(n)$ given by Eq. (577)
with $n = -(3 + \gamma)$ given by the logarithmic slope of the variance of the APM-like
$P(k)$ at the angular scale $\theta \simeq R/\mathcal{D}$. These results are compared with the
mean of 20 all-sky simulations described in [254] (error bars correspond to
the variance in 20 observations). As can be seen in the figure, the hierarchical
model gives a poor approximation, while the projected tree-level results
match well the simulations for scales $\theta > 1$ deg, which correspond to the
weakly nonlinear regime where $\bar{\xi}_2 < 1$. On small scales the discrepancies
between the tree-level results and the simulations are due to 3D non-linear effects
but also to projection: on small scales the simulations follow the hierarchical
model in Eq. (573), rather than the PT predictions, and therefore $s_3 \simeq r_3 S_3$
gives a good approximation, but $S_3$ should be the non-linear 3D value (for
example, as given by HEPT or EPT, see Sects. 4.5.6 and 5.13, respectively).
Further comparisons with numerical simulations for $s_3$ and $s_4$ are presented
below in Fig. 54 together with observational results.

7.2.6 Reconstructing the PDF of the Projected Density 
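
It is interesting to note that it is possible to write down a functional
relation between the cumulant generating function defined in Eq. (141) for the
projected density, $\varphi_{\rm proj}(y)$, and the one corresponding to cylindrical filtered
cumulants, $\varphi_{\rm cyl}(y)$ [659,468,58]. This can be done from the relation (580), which
straightforwardly leads to

$$\varphi_{\rm proj}(y) = \int \frac{d\chi}{\Xi_\theta(\chi)}\; \varphi_{\rm cyl}\left[ y\, \chi^2\, \psi(\chi)\, \Xi_\theta(\chi) \right], \qquad (588)$$

with

$$\Xi_\theta(\chi) = \frac{\langle \delta^2_{\mathcal{D}\theta,{\rm cyl}} \rangle\, L}{\bar{\omega}_2(\theta)}, \qquad (589)$$

which can be rewritten in terms of the matter fluctuation power spectrum,

$$\Xi_\theta(\chi) = \frac{\int d^2k\, P(k, z)\, W^2(k \mathcal{D}\theta)}{\int d\chi'\, \chi'^4\, \psi^2(\chi') \int d^2k\, P(k, z')\, W^2(k \mathcal{D}'\theta)}. \qquad (590)$$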

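In this expression we have explicitly written the redshift dependence of the
power spectrum. In the case of a power-law spectrum,

$$P(k, z) = P_0(z) \left( \frac{k}{k_0} \right)^{n}, \qquad (591)$$

it takes a much simpler form given by

$$\Xi_\theta(\chi) = \frac{P_0(z)\, \mathcal{D}^{-n-2}}{\int d\chi'\, \chi'^4\, \psi^2(\chi')\, P_0(z')\, \mathcal{D}'^{-n-2}}. \qquad (592)$$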

Together with Eq. (588) this result provides the necessary ingredients to
reconstruct the one-point PDF of the projected density with an inverse Laplace
transform of $\varphi_{\rm proj}(y)$. Note that projection effects alter the shape of the
singularity in $\varphi(y)$, though they preserve the large-density exponential cutoff [659,58].
So far this approach has only been used in the literature to study the
reconstruction of the one-point PDF of the local convergence field in the context of
weak lensing observations [659,468].

We now turn to a brief summary of the basics of weak lensing and its connec¬ 
tions to projection effects. 





7.3 Weak Gravitational Lensing


The first theoretical investigations on the possibility of mapping the large-scale
structure of the universe with weak gravitational lensing date back to
the early nineties [69,70,456,364]. It was then shown that the number of
background galaxies was large enough to serve as tracers of the deformation field
induced by the intervening large-scale structures. In this context the
observation of a coherent shear pattern in the orientation of background galaxies is
interpreted as due to lensing effects caused by the mass concentrations along
the line of sight. The potential interest of such observations has led to further
theoretical investigations, such as the determination of the dependence of the
results on cosmological parameters [676,53,339,665], and to extensive
observational efforts. The latter have recently led to the first reliable detections of
the so-called "cosmic shear" [666,11,693,365].

Although in nature totally different from galaxy counts, it is worth pointing
out that such observations eventually aim at mapping the line-of-sight mass
fluctuations, so that techniques developed for studying galaxy angular
catalogues can be applied. Here we briefly introduce the physics of lensing with
emphasis on connections to angular clustering. More comprehensive
presentations can be found in [25,453].


7.3.1 The Convergence Field as a Projected Mass Map 

The physical mechanism at play in weak lensing surveys is the deflection of
photon paths in gravitational potential fields. The deflection angle per unit
distance, $\delta\alpha/\delta s$, can be obtained from simple computations of the geodesic
equation in the weak field limit^{99}. When the metric fluctuations are purely
scalar, the deflection angle reads

$$\frac{\delta\alpha}{\delta s} = -\frac{2}{c^2}\, \nabla_\perp \phi, \qquad (593)$$

where the spatial derivative is taken in a plane that is orthogonal to the photon
trajectory.

The direct consequence of this bending is a displacement of the apparent
position of the background objects. This depends on the distance of the source
plane, $D_{OS}$, and on the distance between the lens plane and the source plane,
$D_{LS}$. It is given by

$$\boldsymbol{\gamma}^S = \boldsymbol{\gamma}^I - \frac{2}{c^2}\, \frac{D_{LS}}{D_{OS}\, D_{OL}}\, \nabla_{\gamma} \int ds\, \phi(s, \boldsymbol{\gamma}), \qquad (594)$$

where $\boldsymbol{\gamma}^I$ is the position in the image plane and $\boldsymbol{\gamma}^S$ is the position in the source
plane. The gradient is taken here with respect to the angular position (this is
why a $D_{OL}$ factor, the distance to the lens plane, appears). The total deflection
is obtained by an integration along the line of sight, assuming the lens is thin
compared to its angular distance. Calculations are also usually done in the
so-called Born approximation, for which the potential is computed along the
unperturbed photon trajectory.

99 See e.g. [457,544] for a comprehensive presentation of these calculations.

The observable effect which is aimed at, however, is the induced deformation of
background objects. Such an effect is due to the variations of the displacement
field with respect to the apparent position. These variations induce a change
in both size and shape of the background objects, which are encoded in the
amplification matrix, $\mathcal{A}$, describing the linear change between the source plane
and the image plane,

$$\mathcal{A} = \left( \frac{\partial \gamma_i^I}{\partial \gamma_j^S} \right). \qquad (595)$$


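Its inverse, $\mathcal{A}^{-1}$, is actually directly calculable in terms of the gravitational
potential. It is given by the derivatives of the displacement with respect to
the apparent position,

$$\mathcal{A}^{-1}_{ij} \equiv \frac{\partial \gamma_i^S}{\partial \gamma_j^I} = \delta_{ij} - 2\, \frac{D_{LS}}{D_{OS}\, D_{OL}}\, \psi_{,ij}, \qquad (596)$$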

where $\psi$ is the projected potential. Usually its components are written

$$\mathcal{A}^{-1} = \begin{pmatrix} 1 - \kappa - \gamma_1 & -\gamma_2 \\ -\gamma_2 & 1 - \kappa + \gamma_1 \end{pmatrix}, \qquad (597)$$

taking advantage of the fact that it is a symmetric matrix. The components of
this matrix are expressed in terms of the convergence, $\kappa$ (a scalar field), and
the shear, $\gamma$ (a pseudo vector field).
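To make the roles of the convergence and the shear concrete, the following
sketch builds the inverse amplification matrix from (kappa, gamma_1, gamma_2)
and evaluates the magnification, taken here as the determinant of the
amplification matrix, i.e. 1 / [(1 - kappa)^2 - gamma_1^2 - gamma_2^2]. This
standard relation and the function names are additions for illustration and do
not appear explicitly in the text.

```python
def inverse_amplification(kappa, gamma1, gamma2):
    """Inverse amplification matrix in terms of convergence and shear."""
    return [[1.0 - kappa - gamma1, -gamma2],
            [-gamma2, 1.0 - kappa + gamma1]]

def determinant_2x2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def magnification(kappa, gamma1, gamma2):
    """det(A) = 1 / det(A^-1): ratio of image area to source area."""
    return 1.0 / determinant_2x2(inverse_amplification(kappa, gamma1, gamma2))

# Weak lensing regime: kappa and the shear components are of order a percent
kappa, gamma1, gamma2 = 0.01, 0.005, -0.003
a_inv = inverse_amplification(kappa, gamma1, gamma2)
mu = magnification(kappa, gamma1, gamma2)
```

The trace of the matrix depends only on kappa (it equals 2 - 2 kappa), which is
why the convergence is obtained from the trace of the inverse amplification
matrix, while the shear controls the anisotropic distortion. To first order
the magnification is 1 + 2 kappa.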

The key idea for weak lensing observations is then that the collection of tiny
deformations of background galaxies can be used to measure the local shear
field, from which the projected potential, and therefore the convergence field,
can be reconstructed [364]. The latter has a simple cosmological interpretation:
from the trace of Eq. (596) one obtains the convergence^{100},

$$\kappa(\boldsymbol{\gamma}) = \frac{3}{2}\, \Omega_m \int dz_s\, n(z_s) \int d\chi\, \frac{\mathcal{D}(\chi_s, \chi)\, \mathcal{D}(\chi)}{\mathcal{D}(\chi_s)}\, \delta(\chi, \boldsymbol{\gamma})\, (1 + z), \qquad (598)$$

as the integrated line-of-sight density contrast. In Eq. (598) $\chi$ is the distance
along the line of sight and $\mathcal{D}$ are the angular distances. In this relation sources
are assumed to be located at various redshifts with a distribution $n(z_s)$
normalized to unity, and all the distances are expressed in units of $c/H_0$. The
relation (598) is then entirely dimensionless. Note that in general the relation
between the redshift and the distances depends on cosmological parameters,
see Eq. (563).

100 In these sections, $\Omega_m$ is understood to be at z = 0.

7.3.2 Statistical Properties 

To gain insight into the expected statistical properties of the convergence
field, it is important to keep in mind that in Eq. (598) the convergence $\kappa$ is
not normalized as would be the local projected density contrast. The projected
density contrast is actually given by

$$\delta_{\rm proj}(\boldsymbol{\gamma}) = \frac{\kappa(\boldsymbol{\gamma})}{\bar{\omega}}, \qquad (599)$$

where $\bar{\omega}$ is the mean lens efficiency,

$$\bar{\omega} = \frac{3}{2}\, \Omega_m \int dz_s\, n(z_s) \int d\chi\, \frac{\mathcal{D}(\chi_s, \chi)\, \mathcal{D}(\chi)}{\mathcal{D}(\chi_s)}\, (1 + z). \qquad (600)$$

This implies that the skewness of the convergence field is then given by

$$S_3^{\kappa} = \frac{S_3^{\rm proj}}{\bar{\omega}}, \qquad (601)$$

where $S_3^{\rm proj}$ is the skewness of the projected density contrast given by Eq. (587).
As a consequence the skewness of $\kappa$ is expected to display a strong $\Omega_m$
dependence. This property has indeed been found in [53], where it has been shown
using PT that

$$S_3^{\kappa} \approx 40\, \Omega_m^{-0.75} \qquad (602)$$

for sources at redshift unity^{101}. This result has been subsequently extended
to the nonlinear regime [339,326,467,469,666,158], higher-order moments, the
bispectrum [158], and to more complex quantities such as the shape of the
one-point PDF of the convergence field [658,659,468].

7.3.3 Next to Leading Order Effects 

Contrary to the previous cases, corrections to the previous leading order PT
results, e.g. Eq. (602), do not involve only next-to-leading order terms due to
the nonlinear dynamics but also other couplings that appear specifically in the
weak lensing context. Let us list and comment on the most significant of them:

(i) An exact integration of the lens equations leads to lens-lens coupling and
departures from the Born approximation. This induces extra couplings that
have been found to be in all cases negligible for a source population at
redshift of about unity [53,667].

101 For the same reasons that $S_3^{\kappa}$ has a strong $\Omega_m$ dependence, it also depends
significantly on the source redshift distribution.

(ii) The source population clustering properties can also induce non-trivial
effects as described in [55]. This is due to the fact that the source plane is by
itself a random medium, which introduces further couplings due to either
intrinsic galaxy number fluctuations or overlapping of the lens and source
populations. These effects have been found to be small if the redshift
distribution of the sources is narrow enough [55,284], which might indeed put
severe constraints on the observations.

(iii) The magnification effect (when k is large, galaxies are enlarged and can thus 
be more efficiently detected) could also induce extra couplings. Although it 
is difficult to estimate the extent of such an effect, it appears to have only 
modest effects on the high-order statistical properties of the convergence 
field [285]. 

Finally, it is important to note that the first reports of cosmic shear detections
have been challenged by suggestions that part of the signal at small scales
might be due to intrinsic galaxy shape correlations [308,166,121]. This is a
point that should be clarified by further investigations.


7.3.4 Biasing from Weak Gravitational Lensing 

With the arrival of wide surveys dedicated to weak lensing observations^{102}, a
very powerful new window to large-scale structure properties is being opened.
Weak lensing observations can indeed be used not only to get statistical
properties of the matter density field, but also to map the mass distribution in
the Universe. In particular it becomes possible to explore the galaxy-mass
local relation [664]. Potentially, galaxy formation models and biasing models can
be directly tested by these observations. It is indeed possible to measure the
correlation coefficient $r_\kappa$ of the convergence field $\kappa$ with the projected density
contrast of the (foreground) galaxies, $\delta_{g,2D}$,

$$r_\kappa = \frac{\langle \kappa\, \delta_{g,2D} \rangle}{\sqrt{\langle \kappa^2 \rangle\, \langle \delta_{g,2D}^2 \rangle}}, \qquad (603)$$

a quantity which, within geometrical factors, is proportional to the $r$
coefficient defined in Eq. (531). What has been measured so far [315] is however
$\langle \kappa\, \delta_{g,2D} \rangle / \langle \delta_{g,2D}^2 \rangle$, a quantity that roughly scales like $\Omega_m\, r/b$.
Pioneering results suggest a scale-independent $r/b$ parameter of about unity for the
favored cosmological model ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$) [315]. Such observations are
bound to become commonplace in the coming years and will provide valuable
tests for galaxy formation models.

102 See for example http://terapix.iap.fr/Descart/

7.4 Redshift Distortions


In order to probe the three-dimensional distribution of galaxies in the
Universe, galaxy redshifts are routinely used as an indicator of radial distance
from the observer, supplemented by the two-dimensional angular position on
the sky. The Hubble expansion law tells us that the recession velocity of an
object is proportional to its distance from us; however, the observed velocity
also has a contribution from peculiar velocities, which are generated due to
the dynamics of clustering and are unrelated to the Hubble expansion and
thus contaminate the distance information. Therefore, the clustering pattern
in "redshift space" is somewhat different than the actual real-space
distribution. This is generically known as "redshift distortions".

At large scales, the main effect of peculiar velocities is due to the infall of
galaxies into clusters. Galaxies between us and the cluster have their infall velocities
added to the Hubble flow and thus appear farther away in redshift space,
whereas those galaxies falling into the cluster from the far side have their
peculiar velocities subtracted from the Hubble flow, and thus appear closer
to us than in real space. As a consequence of this, large-scale structures in
redshift space appear flattened or "squashed" along the line of sight. On the
other hand, at small scales (smaller than the typical cluster size) the main
effect of peculiar velocities is due to the velocity dispersion from virialization.
This causes an elongation along the line of sight of structures in redshift space
relative to those in real space, the so-called "finger of God" effect (which points
to the observer's location).
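A toy numerical example may help fix the two regimes. It assumes that the
redshift-space distance is simply the real-space distance plus the line-of-sight
peculiar velocity divided by H_0 = 100 h km/s/Mpc, with distances in Mpc/h. The
cluster position and the velocities are made-up illustrative values, and the
function name is hypothetical.

```python
H0 = 100.0   # km/s per (Mpc/h)

def redshift_space_distance(r, v_radial):
    """Apparent distance (Mpc/h) for real distance r and radial velocity in km/s.

    v_radial > 0 means the galaxy moves away from the observer.
    """
    return r + v_radial / H0

# Coherent infall onto a cluster located at 100 Mpc/h
near_side = redshift_space_distance(95.0, +300.0)   # falls away from us
far_side = redshift_space_distance(105.0, -300.0)   # falls toward us
real_extent = 105.0 - 95.0
squashed_extent = far_side - near_side

# Virialized galaxies inside the cluster, all at 100 Mpc/h in real space
members = [redshift_space_distance(100.0, v) for v in (-500.0, 0.0, 500.0)]
finger_length = max(members) - min(members)
```

In this toy setup the infalling pair is separated by 10 Mpc/h in real space but
only 4 Mpc/h in redshift space, while the cluster members, which coincide in
real space, are stretched over 10 Mpc/h along the line of sight.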

7.4.1 The Density Field in Redshift Space

We now discuss the effects of redshift distortions on clustering statistics such 
as the power spectrum, the bispectrum and higher-order moments of the 
smoothed density field. See the exhaustive review [295] for details on the
theoretical description of linear redshift distortions and observational results. In
redshift space, the radial coordinate s of a galaxy is given by its observed
radial velocity, a combination of its Hubble flow plus “distortions” due to
peculiar velocities. Here we restrict ourselves to the “plane-parallel” approximation, so
that the line of sight is taken as a fixed direction, denoted by $\hat{\mathbf{z}}$. Plane-parallel
distortions maintain statistical homogeneity, so Fourier modes are still the 
natural basis in redshift-space. On the other hand, statistical isotropy is now 
broken, because clustering along the line of sight is different from that in the 
perpendicular directions. 

However, when the radial character of redshift distortions is taken into account,
the picture changes. Radial distortions respect statistical isotropy (about
the observer), but break statistical homogeneity (since there is a preferred location,
the observer’s position). In this case Fourier modes are no longer special;
in particular, the power spectrum is no longer diagonal [703]. Alternative
approaches to Fourier modes have been suggested in the literature [306,292,616];
here we review the simplest case of plane-parallel distortions, where most of
the results have been obtained. We should note that this is not just of academic
interest: it has been checked with N-body simulations that results on
monopole averages of different statistics carry over to the radial case with very
small corrections [566].

The mapping from real-space position $\mathbf{x}$ to redshift space in the plane-parallel
approximation is given by:

$$\mathbf{s} = \mathbf{x} - f\, u_z(\mathbf{x})\, \hat{\mathbf{z}}, \qquad (604)$$

where $f(\Omega_m) \approx \Omega_m^{0.6}$ is the logarithmic growth rate of linear perturbations, and
$\mathbf{u}(\mathbf{x}) \equiv -\mathbf{v}(\mathbf{x})/(\mathcal{H} f)$, where $\mathbf{v}(\mathbf{x})$ is the peculiar velocity field, and
$\mathcal{H}(\tau) \equiv (1/a)(da/d\tau) = Ha$ is the conformal Hubble parameter (with FRW scale factor
$a(\tau)$ and conformal time $\tau$). The density field in redshift space, $\delta_s(\mathbf{s})$, is
obtained from the real-space density field $\delta(\mathbf{x})$ by requiring that the
redshift-space mapping conserves mass, i.e.

$$(1 + \delta_s)\, d^3s = (1 + \delta)\, d^3x. \qquad (605)$$

Using the fact that $d^3s = J(\mathbf{x})\, d^3x$, where $J(\mathbf{x}) = |1 - f \nabla_z u_z(\mathbf{x})|$ is the exact
Jacobian of the mapping in the plane-parallel approximation, it yields

$$\delta_s(\mathbf{s}) = \frac{\delta(\mathbf{x}) + 1 - J(\mathbf{x})}{J(\mathbf{x})}. \qquad (606)$$


The zeros of the Jacobian describe caustics in redshift space, the locus of
points where the density field is apparently infinite [450]. This surface is
characterized in real space by those points which are undergoing turn-around in
the gravitational collapse process, so their peculiar velocities exactly cancel
the differential Hubble flow. In practice, caustics are smoothed out by
sub-clustering, see e.g. the discussion in [330].
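
As an illustration of Eqs. (604)-(606), the following minimal sketch applies the plane-parallel mapping to a one-dimensional toy model and checks numerically that mass is conserved. The sinusoidal profiles for the scaled velocity and the density contrast, the parameter values, and all function names are assumptions made for this example only; the amplitude is chosen so that the Jacobian never vanishes (no caustics).

```python
import math

def u_z(x, amp):
    """Toy scaled velocity u_z(x) = amp * sin(x) (assumed profile)."""
    return amp * math.sin(x)

def du_z(x, amp):
    """Analytic derivative d u_z / dx of the toy profile."""
    return amp * math.cos(x)

def delta(x, eps):
    """Toy real-space density contrast with zero mean over one period."""
    return eps * math.cos(x)

def to_redshift_space(x, f, amp):
    """Eq. (604) in one dimension: s = x - f u_z(x)."""
    return x - f * u_z(x, amp)

def delta_s(x, f, amp, eps):
    """Eq. (606): delta_s = (delta + 1 - J) / J, with J = |1 - f du_z/dx|."""
    jac = abs(1.0 - f * du_z(x, amp))
    return (delta(x, eps) + 1.0 - jac) / jac

def redshift_space_mass(f, amp, eps, n=4000):
    """Trapezoid integral of (1 + delta_s) ds over one period of the map."""
    total = 0.0
    for i in range(n):
        xa = 2.0 * math.pi * i / n
        xb = 2.0 * math.pi * (i + 1) / n
        sa = to_redshift_space(xa, f, amp)
        sb = to_redshift_space(xb, f, amp)
        rho_a = 1.0 + delta_s(xa, f, amp, eps)
        rho_b = 1.0 + delta_s(xb, f, amp, eps)
        total += 0.5 * (rho_a + rho_b) * (sb - sa)
    return total

if __name__ == "__main__":
    f, amp, eps = 0.5, 0.8, 0.1  # f * amp < 1, so the map stays one-to-one
    print(redshift_space_mass(f, amp, eps), 2.0 * math.pi)
```

The redshift-space integral should agree with the real-space mass per period, $2\pi$, up to quadrature error, which is the content of Eq. (605).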

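An expression for the density contrast in redshift space follows from Eq. (606) [562]:

$$\delta_s(\mathbf{k}) = \int \frac{d^3x}{(2\pi)^3}\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, e^{i f k_z u_z(\mathbf{x})} \left[\delta(\mathbf{x}) + f \nabla_z u_z(\mathbf{x})\right], \qquad (607)$$

where we assumed here that only points where $f \nabla_z u_z(\mathbf{x}) < 1$ contribute. The
only other approximation in this expression is the use of the plane-parallel
approximation, i.e. this is a fully non-linear expression. To obtain a perturbative
expansion, we expand the second exponential in power series,

$$\delta_s(\mathbf{k}) = \sum_{n=1}^{\infty} \int d^3k_1 \ldots d^3k_n\, [\delta_D]_n \left[\delta(\mathbf{k}_1) + f \mu_1^2\, \theta(\mathbf{k}_1)\right] \frac{(f \mu k)^{n-1}}{(n-1)!}\, \frac{\mu_2}{k_2}\theta(\mathbf{k}_2) \cdots \frac{\mu_n}{k_n}\theta(\mathbf{k}_n), \qquad (608)$$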
where $[\delta_D]_n \equiv \delta_D(\mathbf{k} - \mathbf{k}_1 - \cdots - \mathbf{k}_n)$, the velocity divergence $\theta(\mathbf{x}) \equiv \nabla \cdot \mathbf{u}(\mathbf{x})$,
and $\mu_i \equiv \mathbf{k}_i \cdot \hat{\mathbf{z}}/k_i$ is the cosine of the angle between the line of sight and the
wave-vector. In linear PT, only the n = 1 term survives, and we recover the
well-known formula due to Kaiser [362]

$$\delta_s(\mathbf{k}) = \delta(\mathbf{k})\,(1 + f\mu^2). \qquad (609)$$

Equation (608) can be used to obtain the redshift-space density field beyond
linear theory. In redshift space we can write

$$\delta_s(\mathbf{k}, \tau) = \sum_{n=1}^{\infty} D_1^n(\tau) \int d^3k_1 \ldots d^3k_n\, [\delta_D]_n\, Z_n(\mathbf{k}_1, \ldots, \mathbf{k}_n)\, \delta_1(\mathbf{k}_1) \cdots \delta_1(\mathbf{k}_n), \qquad (610)$$


where $D_1(\tau)$ is the density perturbation growth factor in linear theory, and we
have assumed that the nth-order growth factor $D_n \propto D_1^n$, which is an excellent
approximation (see [560], Appendix B.3). Since a local deterministic and non-linear
bias can be treated on an equal footing with non-linear dynamics, it is
possible to obtain the kernels $Z_n$ including biasing and redshift distortions.
From Eqs. (525) and (608)-(610), the redshift-space kernels $Z_n$ for the galaxy
density field read [669,562] (footnote 103)


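$$Z_1(\mathbf{k}) = (b_1 + f\mu^2), \qquad (611)$$

$$Z_2(\mathbf{k}_1, \mathbf{k}_2) = b_1 F_2(\mathbf{k}_1, \mathbf{k}_2) + f\mu^2 G_2(\mathbf{k}_1, \mathbf{k}_2) + \frac{f\mu k}{2}\left[\frac{\mu_1}{k_1}(b_1 + f\mu_2^2) + \frac{\mu_2}{k_2}(b_1 + f\mu_1^2)\right] + \frac{b_2}{2}, \qquad (612)$$

$$Z_3(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = b_1 F_3^{(s)}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) + f\mu^2 G_3^{(s)}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) + f\mu k \left[b_1 F_2^{(s)}(\mathbf{k}_1, \mathbf{k}_2) + f\mu_{12}^2 G_2^{(s)}(\mathbf{k}_1, \mathbf{k}_2)\right]\frac{\mu_3}{k_3}$$
$$\qquad + f\mu k\,(b_1 + f\mu_1^2)\,\frac{\mu_{23}}{k_{23}}\, G_2^{(s)}(\mathbf{k}_2, \mathbf{k}_3) + \frac{(f\mu k)^2}{2}(b_1 + f\mu_1^2)\,\frac{\mu_2}{k_2}\frac{\mu_3}{k_3} + 3 b_2 F_2^{(s)}(\mathbf{k}_1, \mathbf{k}_2) + \frac{b_3}{6}, \qquad (613)$$

where we denote $\mu \equiv \mathbf{k} \cdot \hat{\mathbf{z}}/k$, with $\mathbf{k} \equiv \mathbf{k}_1 + \ldots + \mathbf{k}_n$, and $\mu_i \equiv \mathbf{k}_i \cdot \hat{\mathbf{z}}/k_i$
(with $\mu_{ij}$ and $k_{ij}$ defined analogously for $\mathbf{k}_i + \mathbf{k}_j$). As
above, $F_2$ and $G_2$ denote the second-order kernels for the real-space density
and velocity-divergence fields, and similarly for $F_3$ and $G_3$. Note that the
third-order kernel $Z_3$ must still be symmetrized over its arguments. One can
similarly obtain the PT kernels $Z_n$ in redshift space to arbitrary higher order.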
We note that there are two approximations involved in this procedure: one
is the mathematical step of going from Eq. (607) to Eq. (608), which
approximates the redshift-space mapping with a power series; the other is the
PT expansion itself (i.e., the expansion of $\delta(\mathbf{k})$ and $\theta(\mathbf{k})$ in terms of linear
fluctuations $\delta_1(\mathbf{k})$). Therefore, one is not guaranteed that the resulting PT
in redshift space will work over the same range of scales as in real space. In
fact, in general, PT in redshift space breaks down at larger scales than in real
space, because the redshift-space mapping is only treated approximately, and
it breaks down at larger scales than does the perturbative dynamics. In
particular, a calculation of the one-loop power spectrum in redshift space using
Eqs. (611)-(613) does not give satisfactory results, because expanding the
exponential in Eq. (607) is a poor approximation. To extend the leading-order
calculations, one must treat the redshift-space mapping exactly and only
approximate the dynamics using PT [562]. To date, this program has only been
carried out using the Zel’dovich approximation [220,642,301] and second-order
Lagrangian PT [565], as we shall discuss below.

103 Detailed expressions for the second-order solutions are given in [313], including
the (small) dependences on $\Omega_m$ for the unbiased case.


7.4.2 The Redshift-Space Power Spectrum

The calculation of redshift-space statistics in Fourier space proceeds along the
same lines as in the un-redshifted case. To leading (linear) order, the
redshift-space galaxy power spectrum reads [362]

$$P_s(\mathbf{k}) = P_g(k)\,(1 + \beta\mu^2)^2 = \sum_{\ell=0}^{\infty} a_\ell\, P_\ell(\mu)\, P_g(k), \qquad (614)$$

where $P_g(k) \equiv b_1^2 P(k)$ is the real-space galaxy power spectrum, $P(k)$ is the
linear mass power spectrum, and $\beta \equiv f/b_1 \approx \Omega_m^{0.6}/b_1$. Here $P_\ell(\mu)$ denotes the
Legendre polynomial of order $\ell$, and the multipole coefficients are [290,131]

$$a_0 = 1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2, \qquad a_2 = \frac{4}{3}\beta + \frac{4}{7}\beta^2, \qquad a_4 = \frac{8}{35}\beta^2; \qquad (615)$$

all other multipoles vanish. Equation (614) is the standard tool for measuring
$\Omega_m$ from redshift distortions of the power spectrum in the linear regime; in
particular, the quadrupole-to-monopole ratio $R_P \equiv a_2/a_0$ should be a constant,
independent of wavevector k, as $k \to 0$. Note, however, that in these
expressions $\Omega_m$ appears only through the parameter $\beta$, so there is a degeneracy
between $\Omega_m$ and the linear bias factor $b_1$. Equation (615) assumes deterministic
bias; for stochastic bias extensions see [517,180].
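
The multipole coefficients of Eq. (615) can be checked by projecting the linear factor $(1 + \beta\mu^2)^2$ onto Legendre polynomials numerically. The sketch below does this with a composite Simpson rule; the function names and the choice of quadrature are assumptions of this example, not part of the cited references.

```python
def legendre(ell, mu):
    """Legendre polynomials P_0, P_2 and P_4 (the only ones needed here)."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu ** 2 - 1.0)
    if ell == 4:
        return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0
    raise ValueError("only ell = 0, 2, 4 are implemented")

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def kaiser_multipole(ell, beta):
    """a_ell = (2 ell + 1)/2 * integral over mu of (1 + beta mu^2)^2 P_ell(mu)."""
    def integrand(mu):
        return (1.0 + beta * mu ** 2) ** 2 * legendre(ell, mu)
    return 0.5 * (2 * ell + 1) * simpson(integrand, -1.0, 1.0)

def kaiser_closed_form(beta):
    """Closed forms of Eq. (615) for (a_0, a_2, a_4)."""
    a0 = 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0
    a2 = 4.0 * beta / 3.0 + 4.0 * beta ** 2 / 7.0
    a4 = 8.0 * beta ** 2 / 35.0
    return a0, a2, a4

if __name__ == "__main__":
    beta = 0.5  # illustrative value only
    numeric = [kaiser_multipole(ell, beta) for ell in (0, 2, 4)]
    print(numeric, kaiser_closed_form(beta))
    print("quadrupole-to-monopole ratio:", numeric[1] / numeric[0])
```

Because the ratio $a_2/a_0$ depends on $\beta$ alone, a measurement of it constrains only the combination $f/b_1$, which is the degeneracy noted above.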

From Eq. (607), we can write a simple expression for the power spectrum
in redshift space, $P_s(\mathbf{k})$:

$$P_s(\mathbf{k}) = \int \frac{d^3r}{(2\pi)^3}\, e^{-i\mathbf{k}\cdot\mathbf{r}} \left\langle e^{i\lambda \Delta u_z} \left[\delta(\mathbf{x}) + f\nabla_z u_z(\mathbf{x})\right] \left[\delta(\mathbf{x}') + f\nabla'_z u_z(\mathbf{x}')\right] \right\rangle, \qquad (616)$$

where $\lambda \equiv f k \mu$, $\Delta u_z \equiv u_z(\mathbf{x}) - u_z(\mathbf{x}')$, and $\mathbf{r} \equiv \mathbf{x} - \mathbf{x}'$. This is a fully non-linear
expression; no approximation has been made except for the plane-parallel
approximation. In fact, Eq. (616) is the Fourier analog of the so-called “streaming
model” [508], as modified in [219] to take into account the density-velocity
coupling.

The physical interpretation of this result is as follows. The factors in square
brackets denote the amplification of the power spectrum in redshift space due
to infall (and they constitute the only contribution in linear theory, giving
Kaiser’s [362] result). This gives a positive contribution to the quadrupole
($\ell = 2$) and hexadecapole ($\ell = 4$) anisotropies. On the other hand, at small
scales, as k increases the exponential factor starts to play a role, decreasing
the power due to oscillations coming from the pairwise velocity along the line
of sight. This leads to a decrease in monopole and quadrupole power with
respect to the linear contribution; in particular, the quadrupole changes sign.
In order to describe the non-linear behavior of the redshift-space power
spectrum, it has become popular to resort to a phenomenological model to take
into account the velocity dispersion effects [493]. In this case, the non-linear
distortions of the power spectrum in redshift space are written in terms of
the linear squashing factor and a suitable damping factor due to the
pairwise-velocity distribution function:

$$P_s(\mathbf{k}) = P_g(k)\, \frac{(1 + \beta\mu^2)^2}{\left[1 + (k\mu\sigma_v)^2/2\right]^2}. \qquad (617)$$

Here $\sigma_v$ is a free parameter that characterizes the velocity dispersion along
the line of sight. This Lorentzian form of the damping factor is motivated by
empirical results showing an exponential one-particle (footnote 104) velocity distribution
function [489]; comparisons with N-body simulations have shown it to be a
good approximation [132]; however, these types of phenomenological models
tend to approach the linear PT result faster than numerical simulations [301].
In addition, although $\sigma_v$ can be chosen to fit, say, the quadrupole-to-monopole
ratio at some range of scales, the predictions for the monopole or quadrupole
by themselves do not work as well as for their ratio.
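
To see how the damping in Eq. (617) drives the quadrupole through zero, the following sketch computes the quadrupole-to-monopole ratio of the phenomenological model as a function of the dimensionless product $k\sigma_v$ and locates a sign change by bisection. The function names, the bracketing interval and the value of $\beta$ are illustrative assumptions; $\sigma_v$ is taken to be expressed in the same length units as $1/k$.

```python
def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def legendre(ell, mu):
    """Legendre polynomials P_0 and P_2."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu ** 2 - 1.0)
    raise ValueError("only ell = 0, 2 are implemented")

def damped_multipole(ell, beta, k_sigma):
    """Multipole of Eq. (617) divided by P_g(k), at fixed k * sigma_v."""
    def integrand(mu):
        kaiser = (1.0 + beta * mu ** 2) ** 2
        damping = (1.0 + 0.5 * (k_sigma * mu) ** 2) ** 2
        return kaiser / damping * legendre(ell, mu)
    return 0.5 * (2 * ell + 1) * simpson(integrand, -1.0, 1.0)

def quadrupole_to_monopole(beta, k_sigma):
    """Ratio of the l = 2 to the l = 0 multipole of the damped model."""
    return damped_multipole(2, beta, k_sigma) / damped_multipole(0, beta, k_sigma)

def zero_crossing(beta, lo=0.0, hi=10.0, tol=1e-8):
    """Bisection for a k * sigma_v at which the ratio changes sign in [lo, hi]."""
    r_lo = quadrupole_to_monopole(beta, lo)
    if r_lo * quadrupole_to_monopole(beta, hi) > 0.0:
        raise ValueError("no sign change in the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        r_mid = quadrupole_to_monopole(beta, mid)
        if r_lo * r_mid > 0.0:
            lo, r_lo = mid, r_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    beta = 0.5  # illustrative value only
    for k_sigma in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(k_sigma, quadrupole_to_monopole(beta, k_sigma))
    print("zero crossing at k * sigma_v =", zero_crossing(beta))
```

At $k\sigma_v \to 0$ the ratio reduces to the linear value $a_2/a_0$ of Eq. (615), while at large $k\sigma_v$ it becomes negative, reproducing the sign change of the quadrupole described in the text.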

Accuracy in describing the shape of the quadrupole-to-monopole ratio as a
function of scale is important, since this statistic gives a direct determination
of $\beta$ from clustering in redshift surveys [290,131,132,302]. An alternative to
phenomenological models is to obtain the redshift-space power spectrum using
approximations to the dynamics, as we now discuss.

104 Alternatively, if one assumes the two-particle velocity distribution is
exponential, the suppression factor is the square root of that in Eq. (617), with $\sigma_v$ the
pairwise velocity dispersion along the line of sight, see e.g. [18]. The observational
results regarding velocity distributions and their interpretation are briefly discussed
in Sect. 8.3.2.

In the case of the Zel’dovich approximation (ZA), it is possible to obtain the
redshift-space power spectrum as follows [220,642]. In the ZA, the density field
obeys

$$1 + \delta(\mathbf{x}) = \int d^3q\, \delta_D[\mathbf{x} - \mathbf{q} - \Psi(\mathbf{q})], \qquad (618)$$

where $\Psi(\mathbf{q})$ is the displacement vector at Lagrangian position $\mathbf{q}$. In the
plane-parallel approximation, one can treat redshift distortions in the ZA by noting
that they correspond to amplifying the displacement vector by $f \approx \Omega_m^{0.6}$ along
the line of sight; that is, the displacement vector in redshift space is
$\Psi_s(\mathbf{q}) = \Psi(\mathbf{q}) + f\, \hat{\mathbf{z}}\, (\Psi(\mathbf{q}) \cdot \hat{\mathbf{z}})$. Fourier transforming the expression corresponding to
Eq. (618) in redshift space, the power spectrum gives

$$P_s(\mathbf{k}) = \int d^3q\, \exp(i\mathbf{k}\cdot\mathbf{q})\, \langle \exp(i\mathbf{k}\cdot\Delta\Psi_s) \rangle, \qquad (619)$$

where $\Delta\Psi_s \equiv \Psi_s(\mathbf{q}_1) - \Psi_s(\mathbf{q}_2)$ and $\mathbf{q} \equiv \mathbf{q}_1 - \mathbf{q}_2$. For Gaussian initial
conditions, the ZA displacement is a Gaussian random field, so Eq. (619) can be
evaluated in terms of the two-point correlator of $\Psi(\mathbf{q})$. The results of these
calculations show that the ZA leads to a reasonable description of the
quadrupole-to-monopole ratio [220,642] provided that the zero-crossing scale is fixed to
agree with numerical simulations. In general, the ZA predicts a zero-crossing
at wavenumbers larger than found in N-body simulations [301]. Furthermore,
although the shape of the quadrupole-to-monopole ratio resembles that in the
simulations, the monopole and quadrupole do not agree as well as their ratio.
This can be improved by using second-order Lagrangian PT [571], but the
calculation can no longer be done analytically; instead one has to resort to
numerical realizations of the redshift-space density field in 2LPT.
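
As a sketch of how the Gaussian average in Eq. (619) reduces to two-point correlators of the displacement, one can use the standard identity for the characteristic function of a zero-mean Gaussian variable. The effective wavevector $\mathbf{k}_s$ and the correlator symbol $C_{ij}$ below are notation introduced here for illustration and are not taken from the cited references:

```latex
\begin{align}
\mathbf{k}\cdot\Delta\Psi_s &= \mathbf{k}_s\cdot\Delta\Psi,
  \qquad \mathbf{k}_s \equiv \mathbf{k} + f\,k_z\,\hat{\mathbf{z}},
  \qquad \Delta\Psi \equiv \Psi(\mathbf{q}_1) - \Psi(\mathbf{q}_2), \\
\left\langle e^{\,i\,\mathbf{k}_s\cdot\Delta\Psi} \right\rangle
  &= \exp\!\left[-\tfrac{1}{2}\,k_{s,i}\,k_{s,j}\,
     \langle \Delta\Psi_i\,\Delta\Psi_j \rangle\right], \\
k_{s,i}\,k_{s,j}\,\langle \Delta\Psi_i\,\Delta\Psi_j \rangle
  &= 2\,k_{s,i}\,k_{s,j}\left[C_{ij}(0) - C_{ij}(\mathbf{q})\right],
  \qquad C_{ij}(\mathbf{q}) \equiv
  \langle \Psi_i(\mathbf{q}_1)\,\Psi_j(\mathbf{q}_2) \rangle .
\end{align}
```

The first line follows directly from the definition of the redshift-space displacement; the last line assumes statistical homogeneity, so that the correlator depends only on $\mathbf{q} = \mathbf{q}_1 - \mathbf{q}_2$.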


7.4.3 The Redshift-Space Bispectrum

Given the second-order PT kernel in redshift space, the leading-order
(tree-level) galaxy bispectrum in redshift space reads [313,669,562]

$$B_s(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = 2 Z_2(\mathbf{k}_1, \mathbf{k}_2)\, Z_1(\mathbf{k}_1)\, Z_1(\mathbf{k}_2)\, P(k_1)\, P(k_2) + \mathrm{cyc.}, \qquad (620)$$

which can be normalized by the power spectrum monopole to give the reduced
bispectrum in redshift space, $Q_s$,

$$Q_s(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) \equiv \frac{B_s(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)}{a_0^2 \left[P_g(k_1)\, P_g(k_2) + \mathrm{cyc.}\right]}, \qquad (621)$$

where “cyc.” denotes a sum over permutations of $\{\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3\}$. Note that $Q_s$ is
independent of power spectrum normalization to leading order in PT. Since,
to leading order, $Q_s$ is a function of triangle configuration which separately
depends on $\Omega_m$, $b_1$, and $b_2$, it allows one in principle to break the degeneracy
between $\Omega_m$ and $b_1$ present in measurements of the power spectrum multipoles in
redshift space [236,313]. The additional dependence of (the monopole of) $Q_s$ on
$\Omega_m$ brought by redshift-space distortions is small, typically about 10% [313].
On the other hand, as expected, the quadrupole of $Q_s$ shows a strong $\Omega_m$
dependence [562].

Fig. 48. The left panel shows the bispectrum in redshift space for configurations
with $k_2 = 2k_1$ as a function of the angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$. The dotted
line shows the predictions of second-order Eulerian PT, whereas the solid lines
correspond to 2LPT. Error bars correspond to the average between 4 realizations.
The right panel shows the bispectrum in redshift space for configurations with
$k_2 = 2k_1 = 1.04$ h/Mpc, i.e. in the non-linear regime. Square symbols denote Q in
real space, whereas triangles denote the redshift-space bispectrum. Also shown are
the predictions of PT in real space (dashed lines), PT in redshift space (PTs, dotted
line) and the phenomenological model with $\sigma_v = 5.5$ (PT+$\sigma_v$, continuous line).

Decomposing into Legendre polynomials, $B_{s\,\mathrm{eq}}(\mu) = \sum_{\ell=0}^{\infty} B^{(\ell)}_{s\,\mathrm{eq}} P_\ell(\mu)$, the
redshift-space reduced bispectrum for equilateral configurations reads [562]

$$Q^{(\ell=0)}_{s\,\mathrm{eq}} = \frac{5\,(2520 + 4410\gamma + 1890\beta + 2940\gamma\beta + 378\beta^2 + 441\gamma\beta^2)}{98\, b_1\, (15 + 10\beta + 3\beta^2)^2} + \frac{5\,(9\beta^3 + 1470\, b_1\beta + 882\, b_1\beta^2 - 14\, b_1\beta^4)}{98\, b_1\, (15 + 10\beta + 3\beta^2)^2}, \qquad (622)$$

where $\gamma \equiv b_2/b_1$. This result shows that in redshift space, $Q_{s,g} \neq (Q_s + \gamma)/b_1$
as in Eq. (528), although it is not a bad approximation [562]. In the absence
of bias ($b_1 = 1$, $\gamma = 0$), Eq. (622) yields

$$Q^{(\ell=0)}_{s\,\mathrm{eq}} = \frac{5\,(2520 + 3360 f + 1260 f^2 + 9 f^3 - 14 f^4)}{98\,(15 + 10 f + 3 f^2)^2}, \qquad (623)$$

which approaches the real-space result [232] $Q_{\mathrm{eq}} = 4/7 = 0.57$ in the limit
$f \approx \Omega_m^{0.6} \to 0$. On the other hand, for $f = \Omega_m = 1$, we have $Q^{(\ell=0)}_{s\,\mathrm{eq}} = 0.464$: for these
configurations, the reduced bispectrum is suppressed by redshift distortions.
As discussed before for the power spectrum, leading-order calculations in
redshift space have a more restricted regime of validity than in real space, due
to the rather limited validity of the perturbative expansion for the
redshift-space mapping (instead of the perturbative treatment of the dynamics). The
same situation holds for the bispectrum. The left panel in Fig. 48 shows the
reduced bispectrum Q as a function of angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$ for
configurations with $k_2 = 2k_1 = 0.21$ h/Mpc. The dotted line shows the predictions
of tree-level PT in redshift space, Eq. (621), whereas the symbols correspond
to N-body simulations of the $\Lambda$CDM model ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, $\sigma_8 = 0.7$)
with error bars obtained from 4 realizations. The disagreement is most serious
at collinear configurations. On the other hand, the solid lines obtained using
2LPT [565] agree very well with the N-body measurements. Similarly good
agreement is found for equilateral configurations. The key in the 2LPT
predictions is that the redshift-space mapping is done exactly (by displacing the
particles from real to redshift space in the numerical realizations of the 2LPT
density field), rather than expanded in power series.
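
The unbiased equilateral monopole of Eq. (623) quoted above can be cross-checked against a direct orientation average of the tree-level bispectrum, Eqs. (620)-(621). The sketch below assumes $b_1 = 1$, $b_2 = 0$, the standard second-order real-space kernels $F_2$ and $G_2$, and a power spectrum that takes the same value on all three sides (so that it cancels in the ratio); the quadrature scheme and all function names are choices made for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def f2(k1, k2, c):
    """Second-order density kernel; c is the cosine between k1 and k2."""
    return 5.0 / 7.0 + 0.5 * c * (k1 / k2 + k2 / k1) + 2.0 / 7.0 * c * c

def g2(k1, k2, c):
    """Second-order velocity-divergence kernel."""
    return 3.0 / 7.0 + 0.5 * c * (k1 / k2 + k2 / k1) + 4.0 / 7.0 * c * c

def z1(k, zhat, f):
    """Linear kernel of Eq. (611) with b1 = 1."""
    mu = dot(k, zhat) / math.sqrt(dot(k, k))
    return 1.0 + f * mu * mu

def z2(ka, kb, zhat, f):
    """Second-order kernel of Eq. (612) with b1 = 1 and b2 = 0."""
    na, nb = math.sqrt(dot(ka, ka)), math.sqrt(dot(kb, kb))
    c = dot(ka, kb) / (na * nb)
    mua, mub = dot(ka, zhat) / na, dot(kb, zhat) / nb
    ksum = [x + y for x, y in zip(ka, kb)]
    kz = dot(ksum, zhat)                      # equals mu * k for k = ka + kb
    mu = kz / math.sqrt(dot(ksum, ksum))
    mapping = 0.5 * f * kz * (mua / na * (1.0 + f * mub ** 2)
                              + mub / nb * (1.0 + f * mua ** 2))
    return f2(na, nb, c) + f * mu * mu * g2(na, nb, c) + mapping

def q_eq_closed(f):
    """Closed form of Eq. (623)."""
    num = 5.0 * (2520.0 + 3360.0 * f + 1260.0 * f ** 2 + 9.0 * f ** 3 - 14.0 * f ** 4)
    return num / (98.0 * (15.0 + 10.0 * f + 3.0 * f ** 2) ** 2)

def q_eq_numeric(f, n_c=400, n_phi=48):
    """Average B_s over line-of-sight directions, then apply Eq. (621)."""
    r3 = math.sqrt(3.0) / 2.0
    tri = [(1.0, 0.0, 0.0), (-0.5, r3, 0.0), (-0.5, -r3, 0.0)]
    h = 2.0 / n_c
    total = 0.0
    for i in range(n_c + 1):                  # Simpson rule in cos(theta)
        c = -1.0 + i * h
        w = 1.0 if i in (0, n_c) else (4.0 if i % 2 else 2.0)
        s = math.sqrt(max(0.0, 1.0 - c * c))
        ring = 0.0
        for j in range(n_phi):                # uniform average in azimuth
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            zhat = (s * math.cos(phi), s * math.sin(phi), c)
            for a, b in ((0, 1), (1, 2), (2, 0)):
                ring += 2.0 * z2(tri[a], tri[b], zhat, f) \
                    * z1(tri[a], zhat, f) * z1(tri[b], zhat, f)
        total += w * ring / n_phi
    b_monopole = 0.5 * total * h / 3.0        # B_s monopole in units of P^2
    a0 = 1.0 + 2.0 * f / 3.0 + f * f / 5.0
    return b_monopole / (3.0 * a0 * a0)

if __name__ == "__main__":
    for f in (0.0, 0.5, 1.0):
        print(f, q_eq_numeric(f), q_eq_closed(f))
```

The two evaluations should agree to within quadrature error, and the closed form reduces to 4/7 as $f \to 0$, as stated in the text.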

At small scales, however, 2LPT breaks down and one must resort to some kind
of phenomenological model to account for the redshift distortions induced by
the velocity dispersion of clusters. For the bispectrum, this reads [562]

$$B_s(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \frac{B_s^{PT}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)}{\left[1 + \alpha^2 \left[(k_1\mu_1)^2 + (k_2\mu_2)^2 + (k_3\mu_3)^2\right]^2 \sigma_v^2/2\right]^2}, \qquad (624)$$

where $B_s^{PT}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)$ is the tree-level redshift-space bispectrum. The
assumption is that one can write the triplet velocity dispersion along the line of
sight in terms of the pairwise velocity dispersion parameter $\sigma_v$ and a
constant $\alpha$ which reflects the configuration dependence of the triplet velocity
dispersion. As noted above, $\sigma_v$ is determined from simulations solely using the
power spectrum ratio; the parameter $\alpha$ is then fitted by comparison with the
monopole-to-quadrupole ratio of the equilateral bispectrum measured in the
simulations [562]. A somewhat different phenomenological model can be found
in [669]; in addition, [435] studies the effects of redshift-space distortions in
the nonlinear regime for the three-point correlation using a similar
phenomenological model, assuming the validity of the hierarchical model in real space.

The right panel in Fig. 48 shows the redshift-space bispectrum at small scales,
to show the effects of non-linear redshift distortions. The square symbols
denote Q in real space, which approximately saturates to a constant independent
of configuration. On the other hand, the redshift-space Q shows a strong
configuration dependence, due to the anisotropy of structures in redshift space
caused by cluster velocity dispersion (fingers of God). The phenomenological
model (with $\sigma_v = 5.5$ and $\alpha = 3$) in solid lines does quite a good job at
describing the shape dependence of $Q_s$.

Similar studies using numerical simulations have been carried out in terms
of the three-point correlation function, rather than the bispectrum, to
assess the validity of the hierarchical model in the nonlinear regime in redshift
space [437,614] and to compare with redshift surveys at small scales [84,267,347].
They obtained results analogous to Fig. 48 for the suppression of $Q_s$ for
equilateral configurations compared to Q at small scales due to velocity
dispersion. However, studies of the configuration dependence of $Q_s$ in the non-linear
regime [437,614,347] find no evidence of the configuration dependence shown
in the right panel in Fig. 48. This is surprising, as visual inspection of
numerical simulations shows clear signs of fingers of God; this anisotropy should be
reflected as a configuration dependence of $Q_s$. More numerical work is needed
to resolve this issue.

7.4.4 The Higher-Order Moments in Redshift Space

In redshift space, the PT calculation of skewness and higher-order cumulants
cannot be done analytically, unlike the case of real space, but can be done by a
simple numerical integration for $S_3$ [313] (footnote 105). The effects of redshift distortions,
however, are very small (of order 10%) for the skewness and kurtosis.

On the other hand, at small scales the effect of non-linear redshift distortions
is quite strong; since non-linear growth is suppressed in redshift space due
to cluster velocity dispersion, the skewness and higher-order moments do not
grow much as smaller scales are probed [391,437,614,84,554]. Figure 49 shows
an example for the $S_p$ parameters for top-hat smoothing (p = 3, 4, 5) in the
$\Lambda$CDM model; square symbols denote the real-space values and triangles
correspond to redshift-space quantities. Note the close agreement between real
and redshift space at the largest scales, and the suppression at small scales for
the redshift-space case. The latter looks almost scale independent; however,
one must keep in mind that correlation functions at small scales should be
strongly non-hierarchical, i.e. depend strongly on configuration as shown in
the right panel in Fig. 48.

7.4.5 Cosmological Distortions

Deep galaxy surveys can probe a large volume down to redshifts where the
effects of a cosmological constant, or more generally dark energy, become
appreciable. A geometrical effect, as first suggested in [4], arises in galaxy
clustering measures because the assumption of an incorrect cosmology leads
to an apparent anisotropy of clustering statistics. In particular, structures
appear flattened along the line of sight, and thus the power spectrum and
correlation functions develop anisotropy, similar to that caused by redshift
distortions [18,438,526,181,442]. The challenge in measuring this effect is that
redshift distortions are generally larger than cosmological distortions, so a
reliable measure of cosmological distortions requires an accurate treatment of
redshift distortions.
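
To give a feeling for the size of the geometrical effect, the sketch below compares radial and transverse distance scalings between an assumed and a "true" cosmology. It is a minimal illustration under assumptions not stated in the text: spatially flat models only, a Hubble rate $H(z) = H_0 [\Omega_m (1+z)^3 + \Omega_\Lambda]^{1/2}$, and a dimensionless distortion parameter defined here as the comoving distance times $H(z)/c$. The function names are hypothetical.

```python
import math

def hubble_rate(z, omega_m, omega_l):
    """E(z) = H(z)/H0 for a flat model (assumes omega_m + omega_l = 1)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)

def comoving_distance(z, omega_m, omega_l, n=1000):
    """Comoving distance in units of c/H0, by the composite Simpson rule."""
    h = z / n
    total = 1.0 / hubble_rate(0.0, omega_m, omega_l) \
        + 1.0 / hubble_rate(z, omega_m, omega_l)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_rate(i * h, omega_m, omega_l)
    return total * h / 3.0

def distortion_parameter(z, omega_m, omega_l):
    """Comoving distance times H(z)/c: sets the radial-to-transverse scaling."""
    return comoving_distance(z, omega_m, omega_l) * hubble_rate(z, omega_m, omega_l)

def apparent_flattening(z, true_model, assumed_model):
    """Ratio of radial to transverse stretch when analysing data from
    true_model with distances computed in assumed_model; a value below one
    means structures look squashed along the line of sight."""
    return distortion_parameter(z, *true_model) / distortion_parameter(z, *assumed_model)

if __name__ == "__main__":
    lcdm = (0.3, 0.7)   # fiducial values quoted in the text
    eds = (1.0, 0.0)    # Einstein-de Sitter, used here as the wrong assumption
    for z in (0.1, 0.5, 1.0):
        print(z, apparent_flattening(z, lcdm, eds))
```

In this toy comparison the flattening is at the level of roughly 15% at z = 0.5, which is smaller than typical redshift distortions, in line with the remark above that the two effects must be modeled together.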

Recent work along these lines [444], using the approximation of linear PT
and that bias is linear, scale and time independent, concludes that the best
prospects for measuring cosmological distortions in upcoming surveys are given
by the LRG (luminous red galaxy) sample of the SDSS. This sample of about
100,000 galaxies seems to give a good balance between probing structure down
to ‘high’ redshift ($z \sim 0.5$) and having a large enough number density so
that shot noise is not a limiting factor. Analysis of redshift and cosmological
distortions gives a joint 3-$\sigma$ uncertainty on $\Omega_\Lambda$ and $\Omega_m$ of about 15%, assuming
$\Omega_\Lambda = 0.7$ and $\Omega_m = 0.3$ as the fiducial model. Other surveys, such as the quasar
samples in 2dFGRS and SDSS, are predicted to give less stringent constraints
due to the sparse sampling [444].

105 Using a different approach, [682] recently derived a closed form for $S_3$ in redshift
space that does not agree with [313]. This apparent disagreement merits further
work.

Fig. 49. The $S_p$ parameters for p = 3, 4, 5 (from bottom to top) in redshift space for
$\Lambda$CDM with $\sigma_8 = 0.9$ as a function of smoothing scale R. Square symbols denote
measurements in real-space N-body simulations, whereas triangles correspond to
redshift-space values, assuming the plane-parallel approximation.

Applications of cosmological distortions to the case of the Lyman-$\alpha$ forest
have been proposed in the literature [329,448]. In this case, the distortions are
computed by comparing correlations along the line of sight to those obtained by
cross-correlating lines of sight of nearby quasars. These studies conclude that with
only about 25 pairs of quasars at angular separations of < 2' - 3' it is possible
to distinguish an open model from a flat cosmological-constant-dominated
model (with the same $\Omega_m = 0.3$) at the 4-$\sigma$ level. These results, however,
assume a linear description of redshift distortions. More recent analysis using
numerical simulations [449] suggests that with $13(\theta/1')^2$ pairs at separation
less than $\theta$, and including separations < 10', a measurement to 5% can be made
if simulations can predict the redshift-space anisotropy with 5% accuracy, or
to 10% if the anisotropy must be measured from the data.

Finally, we should mention the effect of clustering evolution along the line of
sight, due to observation along the light cone. Estimates of this effect show
that for wide surveys such as 2dFGRS and SDSS it amounts to about 10% in
the power spectrum and higher-order statistics, while it becomes significantly
larger of course for deep surveys, see e.g. [439,283].
8 Results from Galaxy Surveys


8.1 Galaxies as Cosmological Tracers 

Following the discovery of galaxies as basic objects in our universe [547,322,323],
it became clear that their spatial distribution was not uniform but clustered
in the sky, e.g. [709]. In fact, the Local Supercluster was recognized early on
from two-dimensional maps of the galaxy distribution [184]. The first
measurements ever of the angular two-point correlation function $w(\theta)$, done in
the Lick survey [653], already established one of the basic results of galaxy
clustering, that at small scales the angular correlation function $w(\theta)$ has a
power-law dependence on $\theta$ [see Eq. (625) below].

The first systematic study of galaxy clustering was carried out in the 1970’s
by Peebles and his collaborators. In a truly groundbreaking twelve-paper
series [500,303,503,501,504,505,275,575,576,226,577,227], galaxies were seen for
the first time as tracers of the large-scale mass distribution in the
gravitational instability framework (footnote 106). These works confirmed (and extended) the
power-law behavior of the angular two-point function, established its scaling
with apparent magnitude, and measured for the first time the angular power
spectrum and the three- and four-point functions, which were found to follow
the hierarchical scaling $w_N \sim w_2^{N-1}$. The theoretical interpretation of these
observations was done in the framework of galaxies that traced the mass
distribution in an Einstein-de Sitter universe (footnote 107).

These results, however, relied on visual inspections of poorly calibrated
photographic plates, i.e. with very crude magnitudes (e.g., Zwicky) or galaxy counts
(e.g., Lick) estimated by eye, rather than by some automatic machine. These
surveys were the result of adding many different adjacent photographic plates
and the uniformity of calibration was a serious issue, since large-scale
gradients can be caused by varying exposure time, obscuration by our galaxy,
and atmospheric extinction. These effects are difficult to disentangle from
real clustering; attempts were made to reduce them with smoothing
procedures, but this could also result in a removal of real large-scale clustering.
More than 20 years after completion of the Zwicky and Lick surveys, there were
major technological developments in photographic emulsions, computers and
automatic scanning machines, such as the APM (Automatic Plate Measuring
Machine, [374]) and COSMOS [421] micro-densitometers. This allowed a
better calibration of wide-field surveys, as measuring machines locate sources on
photographic plates and measure brightness, positions and shape parameters
for each source [520,519,582,311,384,604,533,574].

106 For an exhaustive review of this and earlier work see [210,508].

107 In this case self-similarity plus stable clustering leads to hierarchical scaling in
the highly non-linear regime, although it does not explain why hierarchical
amplitudes are independent of configuration, see Sect. 4.5. These observations were
partially motivated by work on the BBGKY approach to the dynamics of
gravitational instability [548] and also generated a significant theoretical activity that led
to much of the development of hierarchical models. For a recent historical account
of these results and a comparison with current views in the framework of biased
galaxy formation in CDM models see [515].

In the 1980’s, large numbers of redshifts and scanning machines gave rise to a
second generation of wide-field surveys, with a much better calibration and a
three-dimensional view of the universe (footnote 108). The advent of CCDs revolutionized
imaging in astronomy and soon made photographic plate techniques obsolete
for large-scale structure studies. Nowadays, photometric surveys are done with
large CCD cameras involving millions of pixels and can sample comparable
numbers of galaxies. Furthermore, it is possible with massive multi-fiber or
multi-slit spectroscopic techniques to build large redshift surveys of our nearby
universe such as the LCRS [584], the 2dFGRS (e.g. see [142]) or the SDSS
(e.g. see [699]), as well as of the universe at higher redshifts such as in the
VIRMOS (e.g. see [398]) and DEIMOS surveys (e.g. see [177]).

This significant improvement in the quality of surveys and their sampled
volume allowed more accurate statistical tests and therefore better constraints on
theories of large-scale structure. Stringent constraints from upper limits to
the CMB anisotropy (e.g. [657]), plus theoretical inputs from the production
of light elements (e.g. [696]) and the generation of fluctuations from
inflation in the early universe [602,304,280,20], led to the development of CDM
models [509,75] where most of the matter in the universe is not in the form
of baryons. The three-dimensional mapping of large-scale structures in
redshift surveys showed a surprising degree of coherence [378,324,182] which, when
compared with theoretical predictions of the standard CDM model (e.g. [173]),
led to the framework of biased galaxy formation, where galaxies are not
faithful tracers of the underlying dark matter distribution (Sect. 7.1). Subsequent
observational challenges from the angular two-point function in the APM
survey [422] and counts in cells in the IRAS survey [200,549] led to the demise of
standard CDM models in favor of CDM models with more large-scale power,
with galaxies still playing the role of (mildly) biased tracers of the mass
distribution (e.g. [174]).

The access to the third dimension also allowed analyses of peculiar
velocity statistics through redshift distortions [362,290] (Sect. 7.4, see [295] for a
recent review), and measurements of higher-order correlations became more
reliable, with the hierarchical scaling (Sect. 4.5.5), $\xi_N \simeq Q_N\, \xi_2^{N-1}$, being
established by numerous measurements in 3D catalogs [31,246,92,239].
However, it was not until recently that surveys reached large enough scales to
test the weakly non-linear regime and therefore predictions of PT against
observations [224,249,248,250,225,567,211]. This is an important step forward,
as higher-order statistics encode precious additional information that can be
used to break degeneracies present in measurements of two-point statistics,
constrain how well galaxies trace the mass distribution, and study the
statistics of primordial fluctuations. It is the purpose of the present chapter to
review the observational efforts along these lines.

108 For a review of redshift surveys see e.g. [484,268,607,608].

In this chapter, we discuss the various results obtained from measurements
in galaxy catalogs for traditional statistics such as N-point correlation
functions in real and Fourier space and counts-in-cells cumulants (thus leaving
out many results on the shape of the CPDF itself, including the void
probability function). We do not attempt to provide a comprehensive review of
all relevant observations but rather concentrate on a subsample of them. The
choice reflects the connections to PT and thus there is a strong emphasis on
higher-order statistics. In particular, we do not discuss cosmic velocity
fields, except when redshift distortions are a concern. Also, we do not discuss
the spatial distribution of clusters of galaxies since the statistical significance
of measurements of higher-order statistics is still somewhat marginal.

The remainder of this chapter is mainly divided into two large sections, one
concerning angular surveys (Sect. 8.2), the other one concerning redshift
surveys (Sect. 8.3). Finally, Sect. 8.4 reviews ongoing and future surveys.


8.2 Results from Angular Galaxy Surveys 
8.2.1 Angular Catalogs

We begin our discussion of angular clustering with a brief description of results
from the older generation of catalogs that sets the stage for the more recent
results, and then go into a more detailed description of the current state of
the subject. Table 14 lists the main angular catalogs that have been
extensively analyzed. We show the characteristic parameters of the samples used in
the relevant clustering analyses. The information is organized as follows. The
second column gives the total area, $\Omega$, of the catalog while the fourth column
shows its mean depth, $D$ (associated with the limiting magnitude in the third
column). The fifth column gives the volume in terms of a characteristic length,
$D_E$. The sixth column gives the surface density, $n_g$, which also relates to the
mean depth. The three numbers, $\Omega$, $D$ and $D_E$, control volume (area) and edge
effects discussed in Chapter 6. In particular, samples with similar volumes can
have quite different sampling biases due to edge effects, because of differences
in the shape (angular extent) of the survey. The galaxy number density, $n_g$,
relates to discreteness errors (Chapter 6), which of course are more significant
when the total number of objects in the catalog is small. Finally, let us note
that some of these catalogs were constructed with different photometric filters
(typically blue).

The original Zwicky catalog ([710] 1961-1968) contains galaxies to magnitude
$m < 15.7$. In most angular clustering analyses only galaxies brighter than
$m = 14.5$-$15$ (with $\sim 2000$ gal/sr) and only in the North galactic cap
($\Omega \simeq 1.8$ ster) were used. The mean depth is estimated to be about 50-80 Mpc/h.

Table 14

Angular Catalogs. The first 5 entries correspond to “old” catalogs (1961-1974) based
on counts or magnitude/diameter estimates by eye and with poor calibration. The
survey area $\Omega$ is given in steradians; the depth $D$ (mean luminosity distance) and
effective size $D_E \equiv (\Omega/4\pi)^{1/3}\, 2D$ are in Mpc/h. The sign ~ reflects the fact that
different sub-samples have different values for that quantity.

Name         Area         Magnitudes          Depth D   D_E    # gal/ster   Ref
Zwicky       1.8 ster     m_z < 15            70        73     ~ 7000       [710]
Lick         3.3 ster     m < 19              220       280    ~ 10^5       [581]
Jagellonian  0.01 ster    m < 21              400       74     ~ 10^6       [543]
ESO/Uppsala  ~ 1.8 ster   d_1 > 1'            60        63     ~ 2000       [317]
UGC          ~ 1.8 ster   d_1 > 1'            70        74     ~ 2000       [478]
APM          1.3 ster     b_J = 17-20         400       380    ~ 10^6       [422]
EDSGC        0.3 ster     b_J = 17-20         400       230    ~ 10^6       [144]
IRAS 1.2Jy   9.5 ster     f_60μm > 1.2 Jy     80        145    480          [218]
DeepRange    0.005 ster   I_AB < 22.5         2000      150    ~ 10^8       [530]
SDSS         ~ 3 ster     r' < 22             1000      1300   ~ 10^7       [699]
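
The effective size quoted in Table 14 follows directly from the area and depth columns. As a minimal sketch (the function name `effective_size` is ours, and the inputs are simply the tabulated values), the definition $D_E = (\Omega/4\pi)^{1/3}\, 2D$ can be evaluated as:

```python
import math

def effective_size(area_sr, depth):
    """Effective size D_E = (Omega / 4 pi)^(1/3) * 2 D, in the units of `depth`."""
    return (area_sr / (4.0 * math.pi)) ** (1.0 / 3.0) * 2.0 * depth

# (area in steradians, mean depth in Mpc/h), copied from Table 14
catalogs = {
    "Zwicky": (1.8, 70.0),
    "Lick": (3.3, 220.0),
    "APM": (1.3, 400.0),
    "EDSGC": (0.3, 400.0),
    "IRAS 1.2Jy": (9.5, 80.0),
}

for name, (area, depth) in catalogs.items():
    print(f"{name:12s} D_E ~ {effective_size(area, depth):6.0f} Mpc/h")
```

For these entries the computed values agree with the tabulated $D_E$ to within the rounding used in the table.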


The base sample used for redshift surveys (see Sect. 8.3) is a wide survey
($\Omega \simeq 2.7$) with about 20000 galaxy positions ($m < 15.5$) taken from photographic
plates with different calibrations. There have been several studies of systematic
errors in Zwicky photometry, showing an important magnitude scale error (see
[257] and references therein); however, it is not clear how seriously this affected
the clustering properties.

The Lick catalog ([581] 1967) consists of 1246 plates of $6 \times 6$ square degrees.
Counts were done by eye. In the analyses presented by Peebles and
collaborators, only 467 plates with $|b^{II}| > 40$ degrees were used. These plates have
overlapping regions which were used to reduce the counts to a uniform limiting
magnitude. Calibration was based on matching the surface density of counts,
$\langle n \rangle$, which is much less reliable than calibration based on comparing
positions and magnitudes of individual sources. Errors on count estimates were
assumed to be independent from cell to cell and to increase the variance by an
additive factor proportional to $\langle n \rangle$. In [275] large-scale gradients in the counts
were removed by applying a “smoothing factor”, which led to some controversy
concerning the significance of the analysis [265,183,276,277].

The Jagellonian field ([543] 1973) consists of a $6 \times 6$ square degree area with
galaxy counts in cells of $3.75' \times 3.75'$, i.e. in a $98 \times 98$ grid (higher resolution
and deeper than in the Lick catalog). There was no attempt to correct for
the lack of uniform optics and plate exposure across the large field of view
(e.g. vignetting effects). Although this is quite a deep survey, its angular extent
is small and it is clear that it should suffer significantly from the volume and
edge effects described in Chapter 6.

The ESO/Uppsala [317] and UGC [478] catalogs are based on several hundreds
of copies of large (ESO/Palomar) Schmidt plates. Galaxies were found with a
limiting visual diameter of about 1'. There is evidence that the selection function
depends on declination, which has to be taken into account while inverting
the angular correlations (e.g. see [345]). Compensation for this effect is likely
to produce large scale artifacts, especially because the sample is relatively
small.

The APM galaxy catalog ([422] 1990) is based on 185 UK IIIA-J Schmidt
photographic plates, each corresponding to $6 \times 6$ square degrees on the sky to
$b_J = 20.5$ and mean depth of 400 Mpc/h (a factor of two deeper than the Lick
catalog) for $b < -40$ degrees and $\delta < -20$ degrees. These fields were scanned
by the APM machine [374]. Galaxy and star magnitudes and positions in the
overlapping regions (of 1 degree per plate) were used to match all plates to a
single calibration/exposure. Because there are calibration errors for individual
galaxies and positions in a plate, a more careful analysis of vignetting and
variable exposure within a plate could be done (as compared to just using
the counts). The resulting matching errors can be used to perform a study
of the biases induced in the clustering analysis. In the results shown here, an
equal-area projection pixel map was used with a resolution of $3.5' \times 3.5'$ cells.

The EDSGC Survey ([144] 1992) consists of 60 UK IIIA-J Schmidt photographic
plates corresponding to $6 \times 6$ square degrees on the sky to $b_J = 20.5$
and mean depth of 400 Mpc/h. In fact, the raw photographic plates are the
same in both the APM and EDSGC catalogs, but the latter only includes
scans of a fraction (1/3) of the APM plates, in the central part. The EDSGC
database was constructed from COSMOS scans [421], with different calibration
and software analyses. Therefore these two catalogs can be considered as
fairly independent realizations of the systematic errors.

The IRAS 1.2 Jy ([606] 1990) is a redshift subsample of the IRAS Point Source
Catalog [123] and is included here because it has also been used to measure
angular clustering. This catalog belongs to a newer generation of wide field
surveys, where magnitudes and positions of objects have been obtained by
automatic measurements. The CfA [324] and SSRS [168] redshift catalogs have
also been used to study angular clustering. More details on redshift samples
will be given in Sect. 8.3.1.

The Deep Range Catalog ([530] 1998) consists of 256 overlapping CCD images
of 16 arc minutes on a side, including 1 arc minute overlap to allow the relative
calibration of the entire survey. Images were taken to $I_{AB} < 24$ with a total
area extending over a contiguous $4 \times 4$ square degree region. The median
redshift for the deeper slices used in the clustering analysis, $I_{AB} = 22$-$22.5$,
is $z \simeq 0.75$, which corresponds to a depth of approximately $2000\,h^{-1}$ Mpc. The
$I_{AB} = 17$-$18$ slice has $z \simeq 0.15$, i.e. a similar depth to the APM catalog.
Note the large surface density of this survey. Although this is quite a deep
survey, its angular extent is rather small and it suffers from the volume and
edge effects described in Chapter 6, especially at the brighter end.

The Sloan Digital Sky Survey (SDSS, e.g. see [699]) was still under construction
when this review was written and only preliminary results are known at this
stage. These results are discussed in a separate section; see Sect. 8.4 for more details.

Smaller, but otherwise quite similar in design to DeepRange, wide mosaic
optical catalogs have been used to study higher-order correlations. One example is
the INT-WFC [540], with $\sim 70000$ galaxies to $R < 23.5$ over two separated
fields of 1.01 and 0.74 square degrees. There are a number of such surveys
currently under analysis or in preparation, such as the FIRST radio source
survey [33], the NOAO Deep Wide-Field Survey [340], the Canada-France
Deep Fields [447], VIRMOS [398], DEIMOS [177] or the NRAO VLA Sky
Survey [156].

Most of the catalogs described above have magnitude information, allowing
one to study subsets at different limiting magnitudes or depth. This can be
used for instance to test the Limber equation [Eq. (569)] and the homogeneity of
the sample [275,422]. Even with the new generation of better calibrated
surveys, there have been some concerns about variable sensitivity inside individual
plates in the APM and EDSGC catalogs [187], and some questions regarding
large-scale gradients in the APM survey have been raised [221]. Later analysis
checked the APM calibration against external CCD measurements over 13000
galaxies from the Las Campanas Deep Redshift Survey, showing an rms error
in the range 0.04-0.05 magnitudes [425]. These studies concluded that
atmospheric extinction and obscuration by dust in our galaxy have negligible effect
on the clustering and also gave convincing evidence for the lack of systematic
errors.

8.2.2 The Angular Correlation Function and Power Spectrum 
The angular two-point correlation function in early surveys was estimated from
the Zwicky catalog, Jagellonian field and Lick survey in [653,503,505,275,272].
For catalogs with pixel maps (counts in some small cells), such as Lick and
Jagellonian, the estimators used were basically factorial moment correlators
as described in Sect. 6.8, whereas for catalogs with individual galaxy positions
(such as Zwicky) the estimators were based on pair counts as discussed in
Sect. 6.4.1.

The angular two-point function was found to be consistent between the Zwicky,
Lick and Jagellonian samples. For a wide range of angular separations, the
estimates were well fitted by a power-law:

$w_2(\theta) \sim \theta^{1-\gamma}, \qquad \gamma \simeq 1.77 \pm 0.04$   (625)


The resulting 3D two-point function, after using Limber’s equation [Eq. (569)] 
for the deprojection of a power-law model, gives consistent results for all cat¬ 
alogs with: 



(626) 


for scales between $0.05\,h^{-1}\,\mathrm{Mpc} < r < 9\,h^{-1}\,\mathrm{Mpc}$ [505,275]. On the largest
scales, corresponding to $r > 10\,h^{-1}$ Mpc, the results are quite uncertain
because correlations are small and calibration errors become relevant. The
results in [275] suggested a break in $\xi_2(r)$ for $r > 10\,h^{-1}$ Mpc. The position
of this break, however, depends on the smoothing corrections applied to the
Lick catalog (which is the one probing the largest scales) on angles $\theta > 3$
degrees [276,277].
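
To make the deprojection of a power law concrete, the following sketch evaluates the amplitude of $w_2(\theta) = A\,\theta^{1-\gamma}$ implied by $\xi_2(r) = (r/r_0)^{-\gamma}$. It assumes the standard small-angle, flat-space form of the Limber equation, which may differ in detail (e.g. curvature and evolution factors) from Eq. (569); the Gaussian selection function and all numerical inputs are illustrative assumptions, not values taken from any catalog.

```python
import math
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def limber_amplitude(r0, gamma, x, phi):
    """Amplitude A of w2(theta) = A * theta**(1 - gamma), theta in radians.

    Assumes xi2(r) = (r / r0)**(-gamma), small angles and Euclidean geometry:
      w2 = int dx x^4 phi^2 int du xi2(sqrt(u^2 + x^2 theta^2)) / (int dx x^2 phi)^2
    x is a grid of distances, phi the (unnormalized) selection function on it.
    """
    # Line-of-sight integral of a power law: sqrt(pi) Gamma((g-1)/2) / Gamma(g/2)
    h_gamma = math.sqrt(math.pi) * math.gamma((gamma - 1.0) / 2.0) / math.gamma(gamma / 2.0)
    num = trapezoid(x ** (5.0 - gamma) * phi ** 2, x)
    den = trapezoid(x ** 2 * phi, x) ** 2
    return r0 ** gamma * h_gamma * num / den

# Toy selection function with characteristic depth 400 Mpc/h (an assumption)
x = np.linspace(1.0, 3000.0, 6000)
phi = np.exp(-(x / 400.0) ** 2)
amplitude = limber_amplitude(r0=5.0, gamma=1.77, x=x, phi=phi)
```

In this approximation the slope of $w_2$ is fixed at $1-\gamma$ and the amplitude scales with depth as $D^{-\gamma}$, which is the scaling used when comparing magnitude slices of different depth.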

Several other groups have measured small numbers of Schmidt and 4-m plates
to produce galaxy surveys of a few hundred square degrees down to $b_J \sim 20$
and a few square degrees down to $b_J \sim 23$ [582,311,384,604,533,574]. Most
of these studies also show a power-law behavior with consistent values and a
sharp break at large scales, the location of the latter depending on the size of
the catalog^{109}. This sharp break, expected in CDM models, is at least in part
caused by finite volume effects, i.e. the integral constraint discussed in e.g.
Sect. 6.4.2^{110}. Thus most of these analyses show uncertain estimations for $w_2$
in the weakly non-linear regime, which is also the case for the ESO/Uppsala
and UGC catalogs [345].

The APM catalog has enough area and depth to probe large scales in the
weakly non-linear regime. The first measurements of the angular two-point
correlation function [422] led to the discovery of “extra” large-scale power
(corresponding to shape parameter^{111} $\Gamma \simeq 0.2$), significantly more than in

109 More recent studies using CCD cameras find that the power-law form of the
small-scale angular correlation function remains in deep samples with amplitude
decreasing with fainter magnitudes [161,539,325,530,447], with indications of a less
steep power-law at the faint end, $I_{AB} \gtrsim 23$ (e.g. [332,530,447]).

110 It is worth pointing out that the cosmic bias caused by the small volume, the
boundary or shot-noise in the sample typically yields lower amplitudes of $w_2$ for
the smaller (nearby) samples. This has been noticed by several authors (e.g. [175],
Fig. 3 in [345]) and sometimes interpreted as a real effect (see also Fig. 8 in [328]).

111 The shape parameter when the contribution of baryons is neglected ($\Omega_b \ll \Omega_m$)
reads $\Gamma \simeq \Omega_m h$, see e.g. [201,76,21]. However, for currently favored cosmological
parameters it is more accurate to use $\Gamma = \exp[-\Omega_b(1 + \sqrt{2h}/\Omega_m)] \times \Omega_m h$ [611].

Fig. 50. The two-point angular correlation function $w_2(\theta)$ (squares with error-bars),
estimated from counts-in-cells and pair-counts in the APM map compared with a
power-law $w_2 \sim \theta^{-0.7}$ (dashed line). Errors are from the dispersion in 4 disjoint
subsamples within the APM. The lower panel shows the ratio of the values in each
zone to the average value in the whole sample.

the standard CDM model ($\Gamma = 0.5$). This result has been confirmed by
measurement of $w_2(\theta)$ in the EDSGC catalog [144], and by subsequent analyses of the
inferred 3D power spectrum from inversion of the APM angular correlation
function [26] and angular power spectrum [27] and inversion from $w_2(\theta)$ to the
3D two-point function [29] (see Sect. 8.2.3 for a brief discussion of inversion
procedures). Both APM and EDSGC find more power than the Lick catalog
on scales $\theta > 2$ degrees, suggesting that the Lick data were overcorrected for
possible large scale gradients [422-425].
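
For orientation, the shape parameter discussed here can be evaluated numerically. The sketch below implements the expression $\Gamma = \exp[-\Omega_b(1+\sqrt{2h}/\Omega_m)]\,\Omega_m h$, which reduces to $\Gamma = \Omega_m h$ when baryons are neglected; the function name and the second set of parameter values are illustrative assumptions rather than values quoted in this review.

```python
import math

def shape_parameter(omega_m, h, omega_b=0.0):
    """Gamma = exp[-Omega_b (1 + sqrt(2h) / Omega_m)] * Omega_m * h."""
    return math.exp(-omega_b * (1.0 + math.sqrt(2.0 * h) / omega_m)) * omega_m * h

# Standard CDM (Omega_m = 1, h = 0.5, no baryons) gives Gamma = 0.5
gamma_scdm = shape_parameter(1.0, 0.5)

# A low-density example (parameter values chosen only for illustration)
gamma_low = shape_parameter(0.3, 0.7, omega_b=0.04)
```

With these illustrative inputs the low-density case lands close to the $\Gamma \simeq 0.2$ suggested by the APM angular clustering, while standard CDM gives the larger value $\Gamma = 0.5$.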

Figure 50 shows the two-point angular correlation function $w_2(\theta)$ estimated
for $\theta > 1$ degree from counts in the pixel maps (i.e. the factorial moment
correlator $W_{11}$, see Sect. 6.8) and at smaller scales from galaxy pair counts
(using the $DD/DR - 1$ estimator, see Sect. 6.4.1). A fit of the two-point
angular correlation with a power-law $w_2 = A\,\theta^{1-\gamma}$, for scales $\theta < 2$ degrees,
gives $A \simeq 2.7 \times 10^{-2}$ and $\gamma \simeq 1.7$ (shown as a dashed line). After inverting
the Limber equation, the corresponding 3D two-point correlation function is
in good agreement with Eq. (626), with a slightly flatter slope $\gamma \simeq 1.7$. The
uncertainty in the value of the correlation length $r_0$ is controlled mainly by
the accuracy in the knowledge of the selection function in Eq. (569) and by
the cosmic errors that we discuss below.
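
The two ingredients of this measurement, a pair-count estimator and a power-law fit, can be sketched as follows. This is a toy illustration under simplifying assumptions (flat-sky separations in a unit square, brute-force pair counting, an unweighted log-log fit); it is not the pipeline used for the APM analysis, and the synthetic amplitude and slope are simply the fitted values quoted above.

```python
import numpy as np

def separations(a, b=None):
    """Flat-sky separations; if b is None each distinct pair of a is counted once."""
    other = a if b is None else b
    d = np.hypot(a[:, None, 0] - other[None, :, 0],
                 a[:, None, 1] - other[None, :, 1])
    if b is None:
        return d[np.triu_indices(len(a), k=1)]
    return d.ravel()

def w2_dd_dr(data, randoms, bin_edges):
    """DD/DR - 1 estimator with pair counts normalized by the number of pairs."""
    n_d, n_r = len(data), len(randoms)
    dd, _ = np.histogram(separations(data), bins=bin_edges)
    dr, _ = np.histogram(separations(data, randoms), bins=bin_edges)
    return (dd / (0.5 * n_d * (n_d - 1))) / (dr / (n_d * n_r)) - 1.0

def fit_power_law(theta, w2):
    """Unweighted least-squares fit of log w2 = log A + (1 - gamma) log theta."""
    slope, intercept = np.polyfit(np.log(theta), np.log(w2), 1)
    return float(np.exp(intercept)), float(1.0 - slope)

# Unclustered points: the estimator should scatter around zero
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=(500, 2))
randoms = rng.uniform(0.0, 1.0, size=(2000, 2))
w2_poisson = w2_dd_dr(data, randoms, np.linspace(0.05, 0.5, 6))

# Noise-free synthetic power law with the amplitude and slope quoted in the text
theta = np.logspace(-2, np.log10(2.0), 20)          # degrees
amp, gamma = fit_power_law(theta, 2.7e-2 * theta ** (1.0 - 1.7))
```

Because the fit is unweighted, it ignores the strong correlations between angular bins, so it should be read as a way to summarize the shape of $w_2$ rather than as a statistically rigorous parameter estimate.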

The APM data show a good match between several disjoint magnitude slices
when scaled according to the Limber equation to the same depth (see Fig. 25
in [425]). The agreement is good up to very large scales, $\theta D > 40\,h^{-1}$ Mpc;
this indicates that the APM catalog can be used to explore the weakly
non-linear regime. Similar conclusions apply to the EDSGC catalog (see [144]),
which is compared in terms of $w_2(\theta)$ to APM in [425] (see also [328]): both
catalogs agree well for $0.1 < \theta < 0.5$ degrees. At larger angular scales, the
EDSGC results differ from APM, essentially because of finite volume and
edge effects due to its smaller area. More worrisome is that at smaller scales,
$\theta < 0.1$ degrees, there are also discrepancies (presumably related to deblending
of galaxies in high-density regions, see [627]) which can be quite significant for
higher-order moments, as we shall discuss in Sect. 8.2.5.

The errors shown in Fig. 50 are obtained from the scatter among four disjoint
subsamples in the APM, which is often an overestimate of the true cosmic
errors at large scales (see end of Sect. 6.4.3). However, as discussed at length
in Chapter 6, error bars give only a partial view of the real uncertainties
(especially in the case of spatial statistics), since measurements at different scales
are strongly correlated. This is illustrated in the bottom panel of Fig. 50, where
the variations of the measured $w_2$ from subsample to subsample are coherent
(and quite significant at the largest scales where edge effects become
important). As a result, the values of $w_2$ change mostly in amplitude and to a lesser
extent in slope from zone to zone. These cross-correlations are not negligible
and must be taken into account to properly infer cosmological information,
since the measurements at different scales are not statistically independent.
Only very recently has the effect of the covariance between estimates at
different scales been included in the analyses of APM [204,203] and EDSGC [331]
angular clustering, by focusing on large scales and using the Gaussian
approximation to the covariance matrix, similar to Eq. (403). We discuss these results
in the next section.
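
The point about coherent zone-to-zone variations can be illustrated with a small numerical sketch. It estimates the covariance and correlation matrix of $w_2$ between angular bins from the scatter among zones; the four synthetic zones below differ only in amplitude, which is an idealized assumption meant to mimic the behavior described for Fig. 50, not actual APM data.

```python
import numpy as np

def zone_covariance(w_zones):
    """Sample covariance between angular bins; w_zones has shape (n_zones, n_bins)."""
    w_zones = np.asarray(w_zones, dtype=float)
    dev = w_zones - w_zones.mean(axis=0)
    return dev.T @ dev / (w_zones.shape[0] - 1)

def correlation_matrix(cov):
    """Normalize a covariance matrix to unit diagonal."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Four synthetic zones sharing the same slope but with different amplitudes
theta = np.array([0.5, 1.0, 2.0, 4.0, 8.0])              # degrees
zone_amplitudes = np.array([0.90, 1.00, 1.05, 1.10])      # illustrative values
w_zones = zone_amplitudes[:, None] * 2.7e-2 * theta ** (-0.7)

cov = zone_covariance(w_zones)
corr = correlation_matrix(cov)        # all entries equal to 1: fully coherent bins
rank = np.linalg.matrix_rank(cov)     # at most n_zones - 1
```

Besides showing the fully correlated limit, the example makes a practical limitation explicit: a covariance matrix built from four zones has rank at most three, so it cannot be inverted for more than three bins without additional modeling.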

Finally, note that the nearly perfect power-law behavior of the angular
correlation function imposes non-trivial constraints on models of galaxy clustering.
Since in CDM models the dark matter two-point correlation function is not a
power-law, this implies that the bias between the galaxy and mass distribution
must be scale dependent in a non-trivial way. The current view (see discussion
in Sect. 7.1.4) is that this happens because the number of galaxies available in
a given dark matter halo scales with the mass of the halo as a power-law with
index smaller than unity. In these scenarios, the fact that the galaxy two-point
function follows a power-law is thus a coincidence. Given the accuracy of the
power-law behavior (see Fig. 50) this situation is certainly puzzling: it seems
unlikely that such a cancellation can take place to such an accuracy^{112}. On
the other hand, these models predict that at small scales galaxy velocity
dispersions and $S_p$ parameters are significantly smaller than for the dark matter,
as observed. We shall come back to discuss this in more detail below.


112 However, one must keep in mind that features in the spatial correlation function
can be significantly washed out due to projection, as first emphasized in [209].

Fig. 51. The APM 3D power spectrum reconstructed from $w_2(\theta)$. The continuous
line shows a linear $P(k)$ reconstruction. The short and long dashed lines show linear
CDM models with $\Gamma = 0.2$ and $\Gamma = 0.5$, normalized to the data at $k \simeq 0.3\,h$/Mpc.

8.2.3 Inversion from Angular to 3D Clustering 

The cosmological information contained in the angular correlation function
of galaxies can be extracted in basically two different ways. One is to just
project theoretical predictions and compare to observations in angular space.
It is also useful to carry out the alternative route of an inversion procedure
from Eq. (570) to recover the 3D power spectrum, and compare to theoretical
predictions in the more familiar 3D space. This has the advantage that it is
possible to carry out parameter estimation on the scales not affected by
non-linear evolution^{113}. To successfully apply this method, however, one must be
able to propagate uncertainties from angular space to 3D space in a reliable
way. Recent work has developed techniques that make this possible.

To go from the angular correlation function to the 3D power spectrum (or
two-point function) requires the inversion of an integral equation with a nearly
singular kernel, since undoing the projection is unstable to features in the 3D
correlations that get smoothed out due to projection. The inverse relation
between $\xi_2(r)$ and $w_2(\theta)$ can be written down formally using Mellin
transforms [209,490]; however, in practice this result is difficult to implement since it
involves differentiation of noisy quantities. Most inversions from $w(\theta)$ [26,29,253]

113 In angular space, this distinction is harder to make due to projection, particularly
for the two-point correlation function. For example, for APM, $w(\theta)$ at $\theta = 1, 2, 3, 5$
degrees has contributions from 3D Fourier modes up to $k = 1, 0.4, 0.3, 0.2\,h$/Mpc,
respectively [188].

and the angular power spectrum [27] in the APM survey used an iterative
deconvolution procedure suggested by Lucy [413] to solve integral equations.
However, although Lucy’s method can provide a stable inversion, it does not
provide a covariance matrix of the recovered 3D power spectrum. Error bars
on the reconstructed 3D power spectrum have been estimated by computing
the scatter in the spectra recovered from four different zones of the APM
survey [26,27]; this can only be considered as a crude estimate and cannot be used
to constrain cosmological parameters in terms of rigorous confidence intervals.
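
The idea behind an iterative deconvolution of this kind can be sketched on a discretized integral equation $w = K p$ with non-negative kernel and solution. The code below is a generic Richardson-Lucy-type iteration applied to a toy Gaussian smoothing kernel; the kernel, grid and starting guess are assumptions made for illustration and do not reproduce the Limber kernel of Eq. (570) or the specific implementation used in [26,27].

```python
import numpy as np

def lucy_inversion(kernel, w_obs, p_init, n_iter=200):
    """Richardson-Lucy-type iterations for w = kernel @ p with non-negative entries."""
    p = np.array(p_init, dtype=float)
    norm = kernel.sum(axis=0)                 # column sums of the kernel
    for _ in range(n_iter):
        w_model = kernel @ p                  # current prediction for the data
        p *= (kernel.T @ (w_obs / w_model)) / norm   # multiplicative update keeps p > 0
    return p

# Toy problem: a smooth positive "spectrum" blurred by a Gaussian kernel
n = 30
idx = np.arange(n)
kernel = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)
p_true = 1.0 + 2.0 * np.exp(-0.5 * ((idx - 15) / 3.0) ** 2)
w_obs = kernel @ p_true

p_rec = lucy_inversion(kernel, w_obs, np.ones(n))
misfit = np.max(np.abs(kernel @ p_rec - w_obs) / w_obs)
```

The multiplicative update keeps the solution positive and converges smoothly in data space, but, as noted above, it returns only a point estimate: nothing in the iteration yields a covariance matrix for the recovered quantity.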

A number of methods have emerged in the last couple of years to overcome
these limitations. These techniques involve some way of constraining
the smoothness of the 3D power spectrum to suppress features in it that lead
to minimal effects on the angular clustering and thus make the inversion
process unstable. A method using a Bayesian prior on the smoothness of the 3D
power spectrum was proposed in [188]. An improved method, based on SVD
decomposition [204], identifies and discards those modes that lead to
instability. Both methods give the covariance matrix for the estimates of the 3D power
spectrum given a covariance matrix of the angular correlations, which can be
done beyond the Gaussian approximation. The resulting 3D covariance matrix
shows significant anti-correlations between neighboring bins [188,204]; this is
expected since oscillatory features in the power spectrum are washed out by
projection and thus are not well constrained from angular clustering data.
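
A minimal sketch of how discarding unstable modes works, and how a covariance matrix is propagated through a linear inversion, is given below. It uses a plain truncated SVD of a toy smoothing kernel with a relative singular-value cut; the kernel, the cut `rel_cut` and the diagonal data covariance are illustrative assumptions, and the specific algorithm of [204] may differ in how modes are weighted and selected.

```python
import numpy as np

def svd_inversion(kernel, w_obs, cov_w, rel_cut=1e-3):
    """Truncated-SVD solution of w = kernel @ p with linear error propagation."""
    u, s, vt = np.linalg.svd(kernel, full_matrices=False)
    keep = s > rel_cut * s[0]                       # drop nearly singular modes
    pinv = (vt[keep].T / s[keep]) @ u[:, keep].T    # pseudo-inverse on kept modes
    p_est = pinv @ w_obs
    cov_p = pinv @ cov_w @ pinv.T                   # covariance of the estimate
    return p_est, cov_p, int(keep.sum())

# Toy problem: Gaussian smoothing kernel acting on a smooth spectrum
n = 30
idx = np.arange(n)
kernel = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)
p_true = 1.0 + 2.0 * np.exp(-0.5 * ((idx - 15) / 3.0) ** 2)
w_obs = kernel @ p_true
cov_w = np.diag((0.01 * w_obs) ** 2)                # assumed 1% uncorrelated errors

p_est, cov_p, n_kept = svd_inversion(kernel, w_obs, cov_w)
```

In this toy setup only part of the modes survive the cut, and the propagated covariance `cov_p` is far from diagonal, which gives a feel for why neighboring bins of a deprojected spectrum come out strongly (anti-)correlated.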

Another technique based on maximum likelihood methods for performing the
inversion is presented in [203] (see e.g. discussion in Sect. 6.11). This has the
advantage of being optimal for Gaussian fluctuations; on the other hand, the
assumption of Gaussianity means that errors and their covariances are
underestimated at scales affected by non-linear evolution, where non-Gaussianity
becomes important. Including the covariance matrix of angular correlations
showed that constraints on the recovered large-scale 3D power spectrum of
APM galaxies become less stringent by a factor of two [204,203] compared to
some of the previous analyses that assumed a diagonal covariance matrix.

Figure 51 displays the APM 3D power spectrum $P(k)$ reconstructed from the
angular two-point correlation function [26,253] inverting Limber’s Eq. (570)
using Lucy’s method. The errorbars are obtained from the dispersion on $w_2(\theta)$
over four zones as shown in Figure 50 and should thus be considered as a
crude estimate, especially at large scales (see [203] for comparison of errors in
different inversion methods). The solid curve corresponds to a reconstruction
of the linear part of the spectrum, which can be fitted by:

$P^{\rm linear}_{\rm APM}(k) \simeq 7 \times 10^5 \, \frac{k}{[1 + (k/0.05)^2]^{1.6}}$   (627)

for $k < 0.6\,h$/Mpc, and $\Omega_m = 1$ [30]. This linearization has been obtained
assuming no bias between APM galaxies and dark matter^{114}, following the

114 Unfortunately, as shown in [30], this assumption is inconsistent at small scales:

linearization first done in [289] and extended in [493] based on the mapping
from the linear to non-linear power spectrum (see e.g. Sect. 4.5.4 for a
discussion). Equation (627) has been obtained by running N-body simulations and
agrees well with the mapping prescription of [335]. Note how non-linear effects
become important at $k > 0.1\,h$/Mpc^{115}.

As can be seen from Fig. 51, a comparison to CDM models on linear scales
($k < 0.3\,h$/Mpc) favors low values of the power-spectrum shape parameter $\Gamma$,
showing more power on these scales than the standard CDM model with
$\Gamma = 0.5$. Indeed, the most recent analysis including the effects of the
covariance matrix discussed above concludes, using the deprojected data for
$k < 0.19\,h$/Mpc, that $0.05 < \Gamma < 0.38$ to 95% confidence [203]^{116}. Similar results have
been obtained from a recent likelihood analysis of the EDSGC survey
angular power spectrum [331]. Figure 51 suggests that on very large scales
($k < 0.05\,h$/Mpc), the APM data might show an indication of a break in the
power spectrum [253]. From the figure, it might seem as if this is a 3-$\sigma$
detection, but as mentioned above different points are not independent.
Analytical studies, using different approximations to account for the covariance
matrix between different band powers, indicate that this might be only a 1-$\sigma$
result [204,203]^{117}.

The above results on the shape parameter of the power spectrum have been 
confirmed by analyses of redshift catalogues as will be discussed in Sect. 8.3.2, 
and will soon be refined by measurements in large ongoing surveys such as the 
2dFGRS or the SDSS (Sect. 8.4). 

On smaller scales, a detailed study [260] of the reconstructed 3D two-point
correlation function in the APM [29] shows an inflection point in the shape
of $\xi_2(r)$ at the transition to the non-linear scale $r \simeq r_0 \simeq 5$ Mpc/h, very much
as expected from gravitational instability (see Sect. 4.5.2).


the higher-order moments predicted by evolving the linear spectrum in Eq. (627) 
are in strong disagreement with the APM measurements at scales R < 10 Mpc/h 
(see Fig. 54 below), indicating that galaxy biasing is operating at non-linear scales. 
On the other hand, the large-scale correlations (R > 10 Mpc/h) are consistent with 
no significant biasing, see Sect. 8.2.6. 

115 In fact, it has been demonstrated in [222] that the one-loop PT predictions 
presented in Sect. 4.2.2 work very well for this spectrum on scales where the fit in 
Eq. (627) is valid, k < 0.6 h/Mpc. 

116 In addition, it was shown that galactic extinction, as traced by the maps in [555],
had little effect on the power spectrum over the APM area with $\delta < -20^\circ$.

117 However, the initial suggestion by [253] for a break in the APM was confirmed
with realistic numerical simulations which show that a mock galaxy catalog as big
as the APM can be used to recover such a break when placed at different scales (see
Figs. 11-12 in [253]). The level of significance for this detection was not studied, so
these apparently discrepant analyses require further investigation.

Table 15

The angular three- and four-point amplitudes $3q_3$ and $16q_4 = 12r_a + 4r_b$, at physical
scales (in Mpc/h) specified in the third column by $D\theta$. The last five entries
correspond to the newer generation of galaxy catalogs (see Table 14). Error bars should
be considered only as rough estimates, see text for discussion. A ditto mark (")
in the last column repeats the estimator of the row above; "-" means no measurement.

3q_3        16q_4       Dθ          Sample           Year   Ref.        Estimator
1.9 ± 0.3   -           0.4-1.2     Jagellonian      1975   [505]       cumulant corr.
3.5 ± 0.4   -           0.1-4       Zwicky (-Coma)   1975   [504]       multiplet counts
5.3 ± 0.9   -           0.1-4       Zwicky           1977   [275]       "
-           100 ± 18    0.1-2       Zwicky           1978   [226]       "
4.7 ± 0.7   -           0.3-10      Lick             1977   [275]       cumulant corr.
-           77 ± 7      0.5-4       Lick             1978   [226]       "
4.8 ± 0.1   40 ± 3      0.3-5       Lick             1992   [618]       "
~ 3         -           0.3-5 (k)   Lick             1982   [229]       bispectrum
2.7 ± 0.1   -           0.2-2       ESO-Uppsala      1991   [345]       multiplet counts
5.4 ± 0.1   -           0.2-2       UGC              1991   [345]       "
3.8 ± 0.3   35 ± 10     4-20        IRAS 1.2Jy       1992   [451]       cumulant corr.
3.5 ± 0.1   31 ± 1      0.5-50      APM (17-20)      1995   [620]       "
3.9 ± 0.6   -           4           APM              1999   [225]       "
2-6         -           4-30        APM              1999   [225]       "
1.5-3       -           0.2-3       LCRS             1998   [347]       multiplet counts
8-3         -           0.5-3       DeepRange        2000   [635]       "
2-1         -           3-6         DeepRange        2000   [635]       "
5-1         -           0.5-20      SDSS             2001   [261,262]   "

8.2.4 Three-Point Statistics and Higher Order 

Angular surveys provide at present the best observational constraints on
higher-order correlation functions in the non-linear regime. In most cases, however, a
detailed exploration of the different configurations available in three-point and
higher-order correlations has not been given, due to limitations in signal to
noise^{118}. This will have to await the next generation of photometric surveys
(e.g. SDSS [699] and DPOSS [187]).

Table 15 summarizes the measurements achieved in various surveys. As can
be seen in the third column of Table 15, the limited size of surveys means that
118 In addition, even with the currently available computational power and fast
algorithms relying on e.g. AD-tree techniques [461], measuring directly higher-order
correlation functions can be very computationally intensive.

most of the measurements only probed the nonlinear regime, except those
done in the IRAS and APM catalogs. The first measurements of the three-point
angular correlation function $w_3$ in the Jagellonian field [505], Lick and
Zwicky surveys [275] established that at small scales the hierarchical model
(see Sect. 4.5.5) gives a good description of the data,

$w_3(\theta_1, \theta_2, \theta_3) = q_3 \left[ w_2(\theta_1)\, w_2(\theta_2) + w_2(\theta_2)\, w_2(\theta_3) + w_2(\theta_3)\, w_2(\theta_1) \right]$,   (628)


where $q_3$ is a constant of order unity with little dependence on scale or
configuration (within the large error bars) at the range of scales probed. In addition,
the four-point function was found to be consistent in the Lick and Zwicky
catalogs with the hierarchical relation,

$w_4(1,2,3,4) = r_a \left[ w_2(1,2)\, w_2(2,3)\, w_2(3,4) + \mathrm{cyc.}\ (12\ \mathrm{terms}) \right] + r_b \left[ w_2(1,2)\, w_2(1,3)\, w_2(1,4) + \mathrm{cyc.}\ (4\ \mathrm{terms}) \right]$,   (629)


where $w_2(i,j) \equiv w_2(\theta_{ij})$ with $\theta_{ij}$ being the angular separation between points
$i$ and $j$. The amplitudes $r_a$ and $r_b$ correspond to the different topologies of
the two types of tree diagrams connecting the four points (see e.g. Fig. 6 and
discussion in Sect. 4.5.5), the so-called snake ($r_a$, first diagram in Fig. 6) and
star diagrams ($r_b$, second diagram in Fig. 6). The overall amplitude of the
four-point function is thus $16 q_4 = 12 r_a + 4 r_b$, which we quote in Table 15, together
with the three-point amplitude $3 q_3$. These are useful to compare with the
angular skewness and kurtosis in Table 16 discussed in Sect. 8.2.5 because in
the hierarchical model $s_N \simeq N^{N-2} q_N$ to very good accuracy^{119}. In addition,
as discussed in Sect. 7.2.3, the $q_N$ coefficients are very weakly dependent on
details of the survey such as the selection function and its uncertainties, so it
is meaningful to compare $q_N$ from different galaxy surveys.
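
The bookkeeping in Eqs. (628) and (629) is simple enough to spell out in code. The sketch below evaluates the hierarchical three-point function for one triangle, inverts it for $q_3$, and forms the combined amplitudes $16q_4 = 12r_a + 4r_b$ and $s_N \simeq N^{N-2} q_N$; the function names and the numerical inputs are illustrative assumptions.

```python
def hierarchical_w3(q3, w_a, w_b, w_c):
    """Eq. (628): w3 = q3 * (w_a w_b + w_b w_c + w_c w_a) for the three sides."""
    return q3 * (w_a * w_b + w_b * w_c + w_c * w_a)

def q3_from_w3(w3, w_a, w_b, w_c):
    """Reduced three-point amplitude obtained by inverting Eq. (628)."""
    return w3 / (w_a * w_b + w_b * w_c + w_c * w_a)

def four_point_amplitude(r_a, r_b):
    """16 q4 = 12 r_a + 4 r_b (12 snake terms and 4 star terms in Eq. 629)."""
    return 12.0 * r_a + 4.0 * r_b

def s_n_hierarchical(n, q_n):
    """s_N ~ N**(N - 2) * q_N in the hierarchical model."""
    return n ** (n - 2) * q_n

# Illustrative two-point values on the three sides of a triangle
w3 = hierarchical_w3(1.2, 0.10, 0.20, 0.30)
q3 = q3_from_w3(w3, 0.10, 0.20, 0.30)          # recovers the input q3
s3 = s_n_hierarchical(3, q3)                   # equals 3 * q3
sixteen_q4 = four_point_amplitude(2.0, 4.0)    # illustrative r_a, r_b
```

The factors 3 and 16 are just the number of tree diagrams at each order, which is why Table 15 quotes $3q_3$ and $16q_4$: these are the combinations directly comparable with the angular skewness and kurtosis.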

In the first and second column of Table 15, in addition to the numerical values 
of $q_3$ and $q_4$, we quote as well the error on the estimate calculated by the 
authors. Except when noted otherwise, error bars were obtained from the dispersion 
in different zones of the catalog. Since typically the number of zones 
used is small (about four in most cases), the estimated errors are very uncertain$^{120}$. 
In addition, this method obviously cannot estimate the cosmic 
variance, which can be a substantial contribution for surveys with small area. 
Many, if not most, of the differences between the various numerical values 
given in Table 15 can be explained by statistical fluctuations and systematics 
due to the finiteness of the catalogs [328] (see Chapter 6 for a detailed 
discussion of these issues), as we now briefly discuss. 
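
The zone-dispersion error estimate described above can be sketched as follows. This is a minimal illustration that treats the zones as independent (and therefore, as noted in the text, misses the cosmic variance). The zone values are hypothetical, and the Gaussian rule of thumb 1/sqrt(2(n-1)) for the fractional uncertainty of a sample standard deviation is an approximation introduced here, not something taken from the cited analyses.

```python
import math
import statistics


def zone_error(values):
    """Return the mean over zones and the standard error of that mean.

    Uses the unbiased sample standard deviation (n - 1 in the denominator)
    and assumes that the zones are statistically independent.
    """
    n = len(values)
    mean = statistics.fmean(values)
    err = statistics.stdev(values) / math.sqrt(n)
    return mean, err


def error_on_error(n):
    """Approximate fractional uncertainty of a sample standard deviation
    estimated from n Gaussian draws: 1 / sqrt(2 (n - 1))."""
    return 1.0 / math.sqrt(2.0 * (n - 1))


if __name__ == "__main__":
    q3_zones = [1.0, 1.4, 0.8, 1.2]  # hypothetical q_3 values in four zones
    mean, err = zone_error(q3_zones)
    print(f"q_3 = {mean:.2f} +/- {err:.2f}")
    print(f"fractional uncertainty of the error bar: {error_on_error(4):.2f}")
```

With only four zones the error bar itself is uncertain at the level of several tens of percent under this approximation, which is the sense in which such errors should be read as rough estimates.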

119 And similarly in the 3D case, see [85,249] for accurate estimates of the small 
corrections to this relation. 

120 However, as discussed at the end of Sect. 6.4.3 for the two-point correlation function, 
when the number of subsamples is large, this method tends to overestimate the real 
cosmic errors. 
The results for $q_3$ in the Zwicky catalog do not seem to be very reliable since 
the value found in [504] changed by more than 50% due to the omission of only 
14 galaxies in the Coma cluster (see [275]). Similar effects have been found 
in other samples (e.g. ESO-Uppsala [345]). This sensitivity reflects that the 
sample is not large enough to provide a fair estimate of higher-order statistics. 
Likewise, the rather low value for $q_3$ found in the Jagellonian sample is 
likely strongly affected by finite volume effects due to the small area covered. 
Similarly, the values obtained from the projected LCRS by [347] could be 
partially contaminated by edge effects due to the particular geometry of the 
catalog (6 strips of 1.5 x 80 degrees) and perhaps also by sampling biases due 
to inhomogeneous sampling around high density regions$^{121}$. 

Work has been done as well to study the dependence of $q_3$ on morphological 
type, but dividing the data into smaller subsamples tends to produce stronger 
statistical biases. In the ESO-Uppsala and UGC catalogs, [345] found that spirals 
have significantly smaller values of $q_3$. This could be interpreted through 
the well-known density-morphology relation [192,529]: spirals avoid rich clusters 
and groups, an effect that could be more important at smaller scales (this 
is illustrated to some extent in Fig. 45). The results for the full sample in the 
ESO-Uppsala and UGC catalogs showed good agreement with the hierarchical 
scaling (note however that error bars quoted in this case are just due to the 
dispersion in the fit to the hierarchical model rather than reflecting sample 
variance). 

The measurements of the three-point correlation function in the Lick survey 
did not show any strong evidence for a dependence of $q_3$ on the shape of the 
triangle, although a marginal trend was found that colinear triangles had a 
higher $q_3$ than isosceles [275]. The three-point statistics were analyzed in terms 
of the bispectrum in [229], who found the same amplitude for $q_3$ as in real 
space, but some indications of a scale dependence beyond the hierarchical 
scaling, with $q_3$ increasing as a function of wavenumber k with a peak corresponding 
to the angular scale (2.5°) of the break in $w_2(\theta)$, and then decreasing 
again at large k. A later re-analysis of the large-scale Lick bispectrum [236] 
showed a marginal indication of dependence on configuration shape, too small 
compared to the one expected in tree-level PT, and thus in principle an indication 
of a large galaxy bias [see Eq. (528)]. However, the scales involved were 
not safely in the weakly non-linear regime and thus this result is likely the 
effect of non-linear evolution rather than a large galaxy bias [560]. 

The four-point function measurements in the Lick survey were not able to test 
the relation in Eq. (629) in much detail, but assuming Eq. (629), measurements 
for some specific configurations (such as squares and lines) gave a constraint 
on the amplitudes $r_a$ and $r_b$, which were then translated into a constraint on 
3D amplitudes by deprojection (see Sect. 7.2.3), resulting in $R_a = 2.5 \pm 0.6$ 


121 Due to the fixed number of fibers per field and “fiber collisions”. Using random 
catalog generation [347] checked that these effects appeared to be insignificant. 
and $R_b = 4.3 \pm 1.2$ [226]. These results on the Lick survey were considerably 
extended in [618] to higher-order $q_N$'s up to N = 8 in the context of the 
degenerate hierarchical model$^{122}$, by using two-point moment correlators$^{123}$. 
This confirmed the hierarchical scaling $w_N \sim w_2^{N-1}$ up to N = 8, at least 
for these configurations, with $q_N \simeq 1-2$.$^{124}$ 

The same technique was applied to the IRAS 1.2 Jy survey in [451], verifying 
the hierarchical scaling for N = 3, 4 but with $q_N$'s with N > 4 being consistent 
with zero, and also to the APM survey [620] which showed non-zero 
amplitudes up to N = 6, with a trend of increasing $q_N$ as a function of N, 
i.e. $q_N = 1.2, 2, 5.3, 10$ for N = 3, 4, 5, 6, unlike the case of the Lick catalog. 
The APM survey was later re-analyzed in terms of cumulant correlators [e.g. 
see Eq. (348)] in [623], showing hierarchical scaling for N = 4, 5 to within 
a factor of two$^{125}$. In addition, it showed that at scales $\theta > 3.5$ degrees the 
factorization property predicted by PT, Eq. (349), starts to hold. By measuring 
$\langle \delta_1^3 \delta_2 \rangle_c$ and $\langle \delta_1^2 \delta_2^2 \rangle_c$ and assuming the hierarchical model as in Eq. (629) 
it was possible to constrain (after deprojection) $R_a \simeq 0.8$ and $R_b \simeq 3.7$, in 
reasonable agreement with the Lick results [226] mentioned in the previous 
paragraph. These imply an average $q_4 \simeq 2.2$. 

The analysis of the three-point function in the DeepRange survey [635] shows 
a general agreement with the hierarchical model with large errors in $q_3$, with 
a consistent decrease as a function of depth. Indeed, a fit to the hierarchical 
model, Eq. (628), gives $q_3 = 1.76, 1.39, 2.80, 1.00, 0.34, 0.57$ for I-band magnitudes 
I = 17-18, 18-19, 19-20, 20-21, 21-22, 22-22.5, respectively. 
This trend is also present in the count-in-cells measurements and, if confirmed 
in other surveys, would have interesting implications for the evolution of galaxy bias 
(see Fig. 55). Note that in this work errors were estimated using the FORCE 
code [621,152,630], which is based on the full theory of cosmic errors as described 
in Chapter 6. 


122 In this case all amplitudes corresponding to different tree topologies are assumed 
to have the same amplitude $q_N$, and thus $R_a = R_b$, etc., see Sect. 4.5.5. 

123 In the same spirit, it is worth noticing that four-point correlation function estimates 
for particular configurations can be obtained through measurements of the 
dispersion of the two-point correlation function over subsamples (or cells) extracted 
from the catalog (see [83,230]): this is a natural consequence of the theory of cosmic 
errors on $w_2$ detailed in Sect. 6.4.3. This method has the potential defect of being 
sensitive to possible artificial large scale gradients in the catalog. 

124 Note however that the errors quoted by the authors come from a fitting procedure, 
not sampling variance. For N > 6 correlations are consistent with zero when using 
the sampling variance among twelve zones. 

125 The scales probed in this case, 0.8° < $\theta$ < 4.5°, are in the transition to the 
nonlinear regime (1 degree corresponds to about 7 Mpc/h at the APM depth), so 
it is not expected to show hierarchical scaling. On the other hand, galaxy biasing 
might help make correlations look more hierarchical, as illustrated in Fig. 45 by the 
suppression in the growth of $S_p$ parameters as small scales are probed. 
Fig. 52. The angular three-point amplitude $q_3(\alpha)$ from PT prediction (thick continuous 
line) compared with the APM measurements at $\theta_{12} = \theta_{13} = 2$°: closed squares 
and open circles correspond to the full APM map and to the mean of 4 disjoint 
zones. Other curves show results for each of the zones (from [225]). 

Some of the analyses above probed the weakly nonlinear regime, where the $q_N$'s 
are expected to show a characteristic angular dependence predicted by PT, 
even after projection from 3D to angular space [242,225,101]. Measurements of 
$q_3$ in the Lick catalog showed a marginal indication that colinear configurations 
are preferred compared to isosceles triangles [275,236] (but see [229]). Projecting 
the three-point function in redshift space from the LCRS survey, [347] 
found a marginal enhancement for colinear triangles, but the scales probed 
(r < 12 Mpc/h) are not safely in the weakly non-linear regime. 

For angular catalogs, the APM survey presents the best available sample to 
check the angular dependence of $q_3$ predicted by PT [225]. Figures 52 and 53 
show the measurements of $q_3(\alpha)$ in the APM survey at $\theta_{12} = \theta_{13}$ = 0.5-4.5 
degrees estimated by counting pairs and triplets of cells of a given angular 
configuration, see Sect. 6.8. Closed squares correspond to estimations in the 
full APM map, while open circles are the mean of $q_3$ estimated in 4 disjoint 
zones. The value of $3 q_3 \simeq 3.9 \pm 0.6$ at $\alpha \simeq 0$, shown in Table 15, is in agreement 
with the cumulant correlators estimated (with 4x4 bigger pixels) in [628]. 
Furthermore, the average over $\alpha$ is comparable to the values of $s_3/3$ in Table 15 
and in particular to the APM and EDSGC estimations [249,451,627]. 

Figure 52 shows the results for individual zones in the APM (same as the ones 
in Fig. 50) for all triangles with $\theta_{12} = \theta_{13} = 2$ degrees. These estimations of 
$q_3$ are subject to larger finite-volume effects, because each zone is only 1/4 
the size of the full APM$^{126}$. As in Fig. 50, there is a strong covariance among 

126 The fact that the average over the four zones (open circles) is not equal to the 
Fig. 53. The projected three-point amplitude $q_3$ in PT (solid curves) and N-body 
results (open triangles with errorbars) for the APM-like power spectrum are compared 
with $q_3$ measured in the APM survey (closed squares and open circles, with 
same meanings as in Fig. 1). Each panel shows the amplitude at different $\theta_{12} = \theta_{13}$. 
In upper right panel, dotted and dashed curves correspond to PT predictions with 
$b_1 = 1$, $b_2 = -0.5$ and $b_1 = 2$, $b_2 = 0$, respectively. In the lower left panel, upper and 
lower solid curves conservatively bracket the uncertainties in the inferred APM-like 
power spectrum, long-dashed curve corresponds to SCDM, and the dotted curve 
shows the leading-order prediction for the $\chi^2$ non-Gaussian model. 

the estimations in different zones, which results in a large uncertainty for the 
overall amplitude $q_3$. Because the zones cover a range of galactic latitude, a 
number of the systematic errors in the APM catalog (star-galaxy separation, 
obscuration by the galaxy, plate matching errors) might be expected to vary 
from zone to zone. No evidence for such systematic variation is found in $q_3$: the 
scatter in individual zone values is compatible with the sampling variance 
observed in N-body simulations [225]. On larger scales, $\theta > 3$ degrees, the 
individual zone amplitudes exhibit large variance, and in addition boundary 
effects come into play. As seen in Fig. 53 at these scales $q_3$ is consistent with 


measurement in the full APM map is a manifestation of estimator bias [328,630]. 
zero within the errors. 

The APM results are compared with the values of $q_3$ predicted by PT with the 
linear APM-like spectrum in Eq. (627) (solid curves) and with measurements 
in N-body simulations (open triangles with errorbars) with Gaussian initial 
conditions corresponding to the same initial spectrum. Since the APM-like 
model has, by construction, the same $w(\theta)$ as the real APM map, it is assumed 
that the sampling errors are similar in the APM and in the simulations. This 
might not be true on the largest scales, where systematics in both the APM 
survey and the simulations (periodic boundaries) are more important. 

At scales $\theta > 1$ degree, the agreement between the APM-like model and the 
APM survey is quite good; this corresponds roughly to physical scales $r > 7$ 
$h^{-1}$ Mpc, not far from the non-linear scale ($r_0 \simeq 5$, where $\xi_2 \simeq 1$). Also note 
that the $q_3$ predicted in the SCDM model (dashed curve in lower-left panel 
of Fig. 53) clearly disagrees with the APM data; this conclusion is independent 
of the power spectrum normalization and it is therefore complementary 
to the evidence presented by two-point statistics [422,200] (see discussion in 
Sects. 8.2.2 and 8.2.3). At smaller angles, $\theta < 1$ deg, $q_3$ in the simulations 
is larger than in either the real APM or PT (top-left panel in Fig. 53). The 
discrepancy between simulations and PT on these relatively small scales is due 
to non-linear evolution. The discrepancy with the real APM is 
probably an indication of galaxy biasing at small scales: this will affect the 
inference of the linear power spectrum from the data [30] and also suppress 
higher-order correlations compared to the dark matter [570] (see e.g. Fig. 45 
and discussion in Sect. 7.1.3). 

8.2.5 Skewness, Kurtosis and Higher-Order Cumulants 

Table 16 shows the results for the skewness ($s_3$) and kurtosis ($s_4$) in several 
of the angular catalogs described in Sect. 8.2.1. 

The analysis of the Zwicky sample by [583] used moments of counts in cells to 
estimate the hierarchical amplitudes $q_N$, assuming the degenerate hierarchical 
model in Sect. 4.5.5. Because counts in cells were used, the measurement is 
closer to $s_3$ than to $q_3$. As noted in Sect. 8.2.4, the Zwicky catalog has been 
shown to be sensitive to a few galaxies in the Coma cluster, a signature that the 
survey is not large enough to be a fair sample for the estimation of higher-order 
moments. Indeed, in [583] it was found that the mean over a four-subsample 
split changed from the values in Table 16 to $s_3 = 4.2 \pm 0.9$ and $s_4 = -7 \pm 12$, 
a manifestation of the estimation biases discussed in Chapter 6. 

In [224] angular positions from volume limited subsamples of redshift catalogs 
(CfA, SSRS and IRAS 1.9 Jy) were used to estimate the angular moments$^{127}$. 
Note for example how the values for $s_3$ and $s_4$ in the CfA survey 
from these smaller samples are lower than in the parent Zwicky sample. 
This suggests again that there are significant systematic finite-volume 
effects [249,621,328,630]. 

127 The values in Table 16, from Table 8 in [224], have been multiplied by $r_3 \simeq 1.2$ 
and $r_4 \simeq 1.5$ for a direct comparison in angular space. 

Table 16 

The reduced skewness and kurtosis from counts-in-cells in angular space. In most 
cases, only the mean values over a range of scales were published. In cases where 
measurements of the individual $s_p$ for each smoothing scale are reported in the 
literature, we quote the actual range and the corresponding range of scales. Error 
bars should be considered only as rough estimates, see text for discussion. 

s_3          s_4          θ (deg)   Sample        Year   Ref. 
2.9 ± 0.9    12 ± 4       1-8       Zwicky        1984   [583] 
2.4 ± 0.4    9.5 ± 2.4    2-20      CfA           1994   [224] 
2.2 ± 0.3    8 ± 3        2-20      SSRS          1994   [224] 
2.5 ± 0.4    11 ± 3       2-20      IRAS 1.9Jy    1994   [224] 
3.8 ± 0.1    33 ± 4       7-30      APM (17-20)   1994   [249] 
5.0 ± 0.1    59 ± 3       0.3-2     APM (17-20)   1994   [249] 
7-4          170-40       0.1-14    EDSGC         1996   [622] 
3.0 ± 0.3    20 ± 5       0.1       APM (17-20)   1998   [627] 
6-2          120-10       0.1-6     DeepRange     2000   [635] 
5-2          100-20       0.5-20    SDSS          2001   [261,262,637] 

Figure 54 shows $S_3$ (filled triangles) and $S_4$ (filled squares) measured in the 
APM survey [249]. The open figures with errorbars correspond to the mean of 
20 N-body all-sky simulations presented in [254] with the linear “APM-like” 
power spectrum in Eq. (627), with 1-$\sigma$ error bars scaled to the size of the 
APM$^{128}$. The continuous line shows the tree-level PT results of [48] numerically 
integrated for the APM-like power spectrum, as described in [254]$^{129}$. 
The uncertainties in the shape of the power spectrum and the evolution of 
clustering of APM galaxies are comparable or smaller than the simulation 
error bars [250]. 

As can be seen in Fig. 54, APM measurements are somewhat below the PT 
predictions or N-body results at $\theta > 1$ degree [48], indicating possibly a slight 
bias for APM galaxies. But note that this difference is not very significant 
given the errors and the fact that there is a strong covariance and a significant 

128 These errors should be considered more realistic than those given in the fifth 
and sixth entry in Table 16, which were derived by combining results at different 
angular scales assuming they are uncorrelated [249]. These errorbars also correspond 
roughly to a 2-$\sigma$ confidence in a single all-sky map: they are twice as large as the 
ones in Fig. 47. 

129 See e.g. Eq. (587) and Sect. 7.2.4 for a discussion of projection in the weakly 
non-linear regime. 
Fig. 54. The angular skewness, S 3 , and kurtosis, S 4 , from the APM catalog (filled 
triangles and squares) as compared with PT results (continuous line) and APM-like 
all-sky N-body simulations (open triangles and squares). 

negative bias on these scales (see section 4.1 in [254]). At smaller angles, $\theta < 1$ 
degree, the N-body results are clearly higher than either PT (due to non-linear 
evolution) or the real APM results (see also top-left panel in Fig. 53 for the 
corresponding result for the three-point function). The latter is likely due to 
galaxy biasing operating at small scales [30], as discussed in the last section. 
Estimations of higher-order moments from the EDSGC [622] up to p = 8 are 
in good agreement within the errors with APM on scales $\theta > 0.1$ degrees. On 
smaller scales, $\theta < 0.1$ degrees, the EDSGC estimates are significantly larger 
than the APM values, indicating systematic problems in the deblending of 
crowded fields [627]$^{130}$. The DeepRange results [635] for the corresponding 
APM slice ($I_{AB}$ = 17-18) give values of $S_3$ and $S_4$ which are intermediate 
between the APM and the EDSGC. This is also the case for the R INT-WFC 
catalog [540]. At larger scales, on the other hand, they both give slightly 
smaller results. This is not a very significant deviation but might indicate 
that the DeepRange survey is not large enough at this bright end and it 
therefore suffers from the same biases that are apparent when the APM $S_p$ 

130 Measurements in this paper were done with an infinite oversampling technique [625]. 
In general, results without significant oversampling could underestimate 
$S_p$ (see also [328]) but this does not explain the difference with the APM analysis, 
where the oversampling was adequate. 
estimations are split into its 6 x 6 square degree plates. For the fainter slices 
the DeepRange results are less subject to volume effects and seem to indicate 
smaller values of $S_3$ and $S_4$ [635] as a function of depth (see Fig. 55 below). 
Finally, we note also that the skewness has been estimated for radio sources 
in the FIRST survey [426] (see also [165] for measurements of the angular 
correlation function), giving values $S_3$ = 1-9 for a depth corresponding to 
1-50 Mpc/h, approximately. 

8.2.6 Constraints on Biasing and Primordial Non-Gaussianity 
Galaxy biasing and primordial non-Gaussianity can leave significant imprints 
in the structure of the correlation hierarchy, as discussed in detail in Sect. 7.1 
and Sects. 4.4 and 5.6, respectively. These effects are best understood at large 
scales, where PT applies and simple arguments such as local galaxy biasing 
(see e.g. Sect. 7.1.1) are expected to hold. The APM survey is at present 
the largest angular survey probing scales in the weakly non-linear regime, 
thus most constraints on biasing and primordial non-Gaussianity from angular 
clustering have been derived from it. For constraints derived from galaxy 
redshift surveys see Sect. 8.3.5. 

The lower-left panel in Fig. 53 shows the linear prediction (dotted lines), 
corresponding to the projection of Eq. (186) [514], for $\chi^2$ initial conditions 
(see Sect. 4.4) with the APM-like initial spectrum [225]. Although the error 
bars are large and highly correlated, the projected three-point function for 
this model is substantially larger than that of the APM measurements and 
the corresponding Gaussian model for intermediate $\alpha$. This may seem only a 
qualitative comparison, since as discussed in Sect. 4.4, non-linear corrections 
for this model are very significant even at large scales. However, non-linear 
corrections lead to even more disagreement with the data: although the shape 
dependence resembles that of the Gaussian case, the amplitude of $q_3$ when 
non-linear corrections are included becomes even larger than the linear result, 
especially at colinear configurations (see Fig. 17). 

This is also in agreement with [252], who used the deprojected $S_p$ from the 
APM survey [249] to constrain non-Gaussian initial conditions from texture 
topological defects [655] which, as in the case of the $\chi^2$ model, also have 
dimensional scaling $\xi_N \sim B_N \xi_2^{N/2}$, with $B_3 \approx B_4 \approx 0.5$ (see Fig. 29). In this 
case it was found that N-body simulations of texture-type initial conditions 
lead to a significant rise at large scales in the $S_p$ parameters not seen in the 
APM data, even when including linear and non-linear (local) bias to match 
the amplitude of $S_p$ at some scale. 

Constraints on a non-local biasing model from the APM S p parameters were 
considered in [248]. The model of cooperative galaxy formation [96], where 
galaxy formation is enhanced by the presence of nearby galaxies, was suggested 
to produce a scale-dependent bias to create additional large-scale power in the 
standard CDM model and thus match the APM angular correlation function. 
However, the effect of this scale-dependent bias is to imprint a significant scale 
Fig. 55. The solid symbols display $s_3$ measured at 0.04° for 6 magnitude slices 
($I_{AB}$ = 17-18, 18-19, 19-20, 20-21, 21-22, 22-22.5, corresponding to increasing 
mean redshift) of the DeepRange catalog (from [635]). Each value of $s_3$ is plotted 
at the median z of the slice. The shaded band shows the predictions of a model of 
galaxy bias evolution, see text for details. The right-shifted error bars for the two 
faintest measurements include errors due to star/galaxy separation [635]. 

dependence on the $S_p$ parameters that is ruled out by the APM measurements 
(see also Fig. 57 below). 

The upper right panel in Fig. 53 shows the PT predictions for the APM-like 
initial power spectrum, Eq. (627), with linear bias parameter $b_1 = 2$ 
(dashed curve) and a non-linear (local) bias model [see e.g. Eq. (525)] with 
$b_1 = 1$, $b_2 = -0.5$ (dotted curve). Even if the errors are 100% correlated, 
these models are in disagreement with the APM data. A more quantitative 
statement cannot be made about constraints of bias parameters from the APM 
higher-order moments since a detailed analysis of the covariance matrix is 
required. However, for linear bias the measurements imply that APM galaxies 
are unbiased to within 20-30% [225]. These constraints agree well with the 
biasing constraints obtained from the inflection point of the reconstructed 
$\xi_2(r)$ in the APM [260]. On the other hand, consideration of non-linear biasing 
can open up a wider range of acceptable linear bias parameters [224,248,671]. 
As an alternative to wide surveys which probe the weakly non-linear regime at 
recent times, deep galaxy surveys can probe the redshift evolution and also 
reach weakly non-linear scales at high redshift. Although presently this is not 
possible due to the small size of current deep surveys, it will become so in 
the near future. An early application along these lines is in Fig. 55, which 
shows the redshift evolution of $S_3$ for measurements of [635] in the DeepRange 
catalog at a fixed angular scale of 0.04°. This corresponds to about 0.3 $h^{-1}$ Mpc 
at $z \simeq 0.15$ and 1.5 $h^{-1}$ Mpc at $z \simeq 0.75$, so the scales involved are in the non-linear 
regime$^{131}$. 

The redshift evolution in Fig. 55 is just the opposite of that expected in generic 
(dimensional) non-Gaussian models, where the skewness $S_3$ should increase 
with redshift (see e.g. discussion in Sect. 5.6). However, since these scales 
are in the non-linear regime the predictions based on PT cannot be safely 
used, and galaxy biasing can behave in a more complicated way. In any case, 
the trend shown in Fig. 55 can be matched by a model, shown as a shaded 
band, where $S_3(z) = S_3(0)\,(1+z)^{-0.5}$ [635], which may indicate that galaxy 
bias is increasing with redshift, as expected in standard scenarios of galaxy 
formation (see discussion in Sect. 7.1), and contrary to the evolution expected 
from strongly non-Gaussian initial conditions. A more quantitative constraint 
will have to await the completion of future deep surveys that can probe the 
weakly non-linear regime. 
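
The phenomenological trend quoted above, S_3(z) = S_3(0) (1+z)^(-0.5), is simple enough to evaluate directly. The short Python sketch below does so; the function name is ours, and the normalization S_3(0) is left as a free (hypothetical) input since its value is not given in the text.

```python
def s3_model(z, s3_today, index=-0.5):
    """Evaluate S_3(z) = S_3(0) * (1 + z)**index.

    index = -0.5 is the value quoted in the text; s3_today is S_3(0).
    """
    return s3_today * (1.0 + z) ** index


if __name__ == "__main__":
    s3_today = 1.0  # arbitrary normalization, for illustration only
    for z in (0.15, 0.75):
        print(f"z = {z:.2f}: S_3(z)/S_3(0) = {s3_model(z, s3_today):.3f}")
```

Because the index is negative, the model skewness decreases monotonically with redshift, which is the opposite of the behaviour expected for the dimensional non-Gaussian models discussed above.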

8.3 Results from Redshift Galaxy Surveys 
8.3.1 Redshift Catalogs 

Redshift surveys map the three-dimensional distribution of galaxies in a large 
volume, and are thus ideally suited to use higher-order statistics to probe 
galaxy biasing and primordial non-Gaussianity. Table 17 shows a list of the 
main wide-field redshift catalogs. For a more general review on redshift catalogs 
see [484,268,607,608]. 

Redshift surveys require a predefined sample of targets to obtain redshifts, 
therefore they are often defined from angular surveys where galaxies are detected 
photometrically. Below we briefly discuss the main characteristics of 
the surveys in Table 17; for a brief description of the photometric parent catalogs 
see Sect. 8.2.1. 

The Center for Astrophysics survey (hereafter CfA, [324]) and the Perseus-Pisces 
redshift Survey (PPS, [268]) are both based on the Zwicky catalog. 
The CfA survey, perhaps the most analyzed redshift survey in the literature, 
consists of 2417 galaxies with Zwicky magnitudes less than 14.5, covering over 
2.67 ster (1.8 ster in the North Galactic cap) with a median redshift corresponding 
to 3300 km/sec. The PPS survey, centered around the Perseus-Pisces 
supercluster, contains over 3000 galaxies. The Southern Sky Redshift Survey 
(hereafter SSRS, [168]) is based on the ESO/Uppsala angular sample, and 
contains about 2000 galaxies. These surveys suffer from the same calibration 
problems as their parent catalogs, but with redshift information they were 
aimed to represent a fair sample of the universe. Recent extensions of these 
surveys to deeper magnitudes (m < 15.5, 2000 redshift, $D \simeq 80$ Mpc/h) are 
denoted by CfA2 and SSRS2 and have been merged into the Updated Zwicky 
Catalog (UZC, [208]). 

131 Although a fixed angular scale does not correspond to a fixed spatial scale as 
a function of z, the comparison is meaningful because the measured $s_3(\theta)$ is scale 
independent (hierarchical) to a good approximation. 

Table 17 

Optical and infrared (last four) redshift catalogs. The survey area $\Omega$ is given in 
steradians, the depth $D$ and effective size $D_E = (\Omega/4\pi)^{1/3}\, 2D$ are in Mpc/h. 

Name          Area Ω      magnitudes           Depth D   D_E   # gal/ster    Ref 
CfA           1.8 ster    m_Z < 14.5           50        52    ~ 1000        [324] 
SSRS          1.8 ster    D(0) > 0.1           50        52    ~ 1000        [168] 
PPS           ~ 1 ster    m > 15.5 - 15        80        70    ~ 3000        [268] 
LCRS          0.02 ster   R < 17.8             300       70    1.3 x 10^6    [584] 
Stromlo-APM   1.3 ster    b_J < 17.15          150       140   1400          [411] 
Durham/UKST   0.45 ster   b_J < 17             140       90    5500          [535] 
2dFGRS        0.6 ster    b_J < 19.5           300       220   ~ 2.5 x 10^5  [481] 
SDSS          ~ 3 ster    r' < 18              275       341   ~ 10^6        [704] 
QDOT          10 ster     f_60μm > 0.6 Jy      90        170   245           [200] 
IRAS 1.9Jy    9.5 ster    > 1.9 Jy             60        110   220           [606] 
IRAS 1.2Jy    9.5 ster    > 1.2 Jy             80        145   480           [218] 
PSCz          10.5 ster   f_60μm > 0.6 Jy      100       188   1470          [550] 
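
The effective size defined in the caption of Table 17, D_E = (Ω/4π)^(1/3) 2D, can be checked against the tabulated entries with a few lines of Python. This is only a sketch: the function name is ours, and the three (area, depth) pairs are copied from the CfA, LCRS and PSCz rows of the table.

```python
import math


def effective_size(area_ster, depth):
    """D_E = (Omega / 4 pi)^(1/3) * 2 D, in the same units as the depth D."""
    return (area_ster / (4.0 * math.pi)) ** (1.0 / 3.0) * 2.0 * depth


if __name__ == "__main__":
    # (area in steradians, depth in Mpc/h), as listed in Table 17
    surveys = {"CfA": (1.8, 50), "LCRS": (0.02, 300), "PSCz": (10.5, 100)}
    for name, (area, depth) in surveys.items():
        print(f"{name}: D_E = {effective_size(area, depth):.0f} Mpc/h")
```

Rounded to the nearest Mpc/h, these give 52, 70 and 188, matching the D_E column. The LCRS case shows why the effective size is useful: despite being six times deeper than the CfA, its small solid angle leaves it with a comparable effective size.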

The LCRS [584] consists of redshifts selected from a well calibrated CCD 
survey of 6 narrow 1.5 x 80 degrees strips in the sky. Although this survey is 
much deeper and better calibrated than any of the previous ones, it is also 
potentially subject to important selection and boundary effects: narrow slices, 
density-dependent sampling (because of a constant number of fibers per field) 
and the exclusion of galaxies closer than 55". All these effects tend to underweight 
clusters and, even if properly corrected, could introduce important 
sampling biases in higher-order statistics$^{132}$. 


132 For example, it is impossible to recover any lost configuration dependence of 
correlation functions in the non-linear regime by a correction procedure, since the 
correcting weight for lost galaxies would have to decide whether they were aligned 
or isotropically distributed. 
The Stromlo-APM redshift survey ([411]) consists of 1790 galaxies with $b_J < 17.15$ 
selected randomly at a rate of 1 in 20 from APM scans in the south 
Galactic cap. The Durham/UKST galaxy redshift survey ([535]) consists of 
2500 galaxy redshifts to a limiting apparent magnitude of $b_J = 17$, covering 
a 1500 sq deg area around the south Galactic pole. The galaxies in this survey 
were selected from the EDSGC and were sampled, in order of apparent 
magnitude, at a rate of one galaxy in every three. 

The IRAS Point Source Redshift Catalog (hereafter PSCz, [550]) is based on 
the IRAS Point Source Catalog (see [123]), with several small additions applied 
to achieve the best possible uniformity over the sky. The survey objective was 
to get a redshift for every IRAS galaxy with 60 micron flux $f_{60} > 0.6$ Jy, 
over as much of the sky as possible. Sky coverage is about 84% with 15411 
galaxies. Earlier subsamples of PSCz include the updated QDOT catalog [200], 
the IRAS 1.9Jy [606] and the IRAS 1.2Jy [218] redshift surveys. The QDOT 
survey chooses at random one in six galaxies from PSCz, leading to 1824 
galaxies with galactic latitude $|b| > 10$°. The other subsamples are shallower 
but denser than QDOT; the 2Jy catalog, complete to a flux limit $f_{60} > 2$ Jy, 
contains 2072 galaxies, whereas the 1.2Jy catalog, with $f_{60} > 1.2$ Jy, contains 
4545 galaxies. IRAS galaxies are mostly biased towards spiral galaxies which 
tend to undersample rich clusters. Thus IRAS galaxies are both sparser and 
a biased sample of the whole galaxy population. 

The Sloan Digital Sky Survey (SDSS, see e.g. [699]) and the two degree field 
2dF Galaxy Redshift Survey (2dFGRS, see [142]) were still under construction 
when this review was written and only preliminary results are known at this 
stage. These results are discussed in section 8.4. 

Other recent redshift surveys for which there are not yet measurements of 
higher-order statistics include the Canada-France Redshift survey [401], the 
Century survey [266], the ESO Slice Project [673], the Updated Zwicky Catalog 
[208] and the CNOC2 Field galaxy survey [113]. 

8.3.2 Two-Point Statistics 

We now briefly discuss results on two-point statistics from redshift surveys, 
with emphasis on the power spectrum. We first address optical surveys and 
then infrared surveys. 

The analysis of the redshift-space correlation function in the CfA survey [172] 
found that, after integration over the parallel direction to project out redshift 
distortions, the resulting two-point function agreed with that derived from inversion 
in angular catalogs, Eq. (626), with $\gamma \simeq 1.77$ and $r_0 = 5.4 \pm 0.3$ Mpc/h, 
for projected separations $r_p < 10$ Mpc/h. At larger scales, the redshift-space 
correlation function estimates become steeper and there was marginal evidence 
for a zero crossing at scales larger than about 20 Mpc/h$^{133}$. Modeling 

133 The measured redshift 2-point function will be found to be flatter than the real 
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the redshift-space correlation function as a convolution of the real-space one 
with an exponential pairwise velocity distribution function ! 144 1 with velocity 
dispersion a v , [172] obtained that cr v = 340 ± 40 km/sec at r p — 1 Mpc/h, 
well below the predictions of CDM models. 

These results were extended a decade later with the analysis of the power 
spectrum in the extension of the CfA survey to mz < 15.5. In [679] it was 
shown that, in agreement with previous results from the APM survey [422] 
and IRAS galaxies [200,549], the standard CDM model was inconsistent with 
the large-scale power spectrum at the 99% confidence level. In addition, [489] 
studied the relation between the real space and redshift space power spec¬ 
trum in CDM simulations, using the Eq. (617), and showed that agreement 
between the small-scale power spectrum and T = 0.2 CDM models required 
a velocity dispersion parameter a v ~ 450 km/sec, somewhat larger than the 
value obtained by modeling the two-point function in [172], A joint anal¬ 
ysis of the CfA/PPS power spectrum gave a best fit CDM shape parameter 
T = 0.34±0.1 [31]. Similarly, a joint analysis of the CfA/SSRS samples in [169] 
showed a power spectrum consistent with CDM models with T & 0.2 and bias 
within 20% of unity when normalized to COBE [596,694] CMB fluctuations 
at the largest scales. A recent analysis [487] of the redshift-space large-scale 
(k < 0.3 h/Mpc) power spectrum of the Updated Zwicky catalog [208], which 
includes CfA2 and SSRS, was done using the quadratic estimator and decor¬ 
relation techniques (see Sects. 6.11.2-6.11.4). The measurements in different 
subsamples are well fit by a ACDM model with normalization fqcrg = 1.2 —1.4. 
The analysis of the LCRS redshift space power spectrum was done in [403], 
where they used Lucy’s method [413] to deconvolve the effects of the window of 
the survey, which are significant given the nearly two-dimensional geometry. 
They obtained results which were consistent with previous analyses of the 
CfA2 and SSRS surveys. An alternative approach was carried out in [394], 
where they estimated the two-dimensional power spectrum, which was found 
to have a “bump” at k = 0.067 Mpc/h with amplitude a factor of « 1.8 larger 
than the smooth best fit P = 0.3 CDM model. This is reminiscent of similar 
features seen in narrow deep “pencil beams” redshift surveys, e.g. [99]p 


space one, with more power on large scales and less power on smaller scales, as 
expected from theory (see Sect.7.4), with evidence for a larger correlation length in 
redshift space, so > ro, in all CfA, SSRS and IRAS catalogues [239]. 

134 An exponential form was first suggested in [507] and has since been supported by 
observations, see e.g. [395] for a recent method applied to the LCRS survey. The
interpretation of this technique, however, rests on the assumption of a scale-independent
velocity dispersion, which seems consistent in LCRS [349], but may not necessarily 
be true in general, see e.g. [298,351] for the PSCz survey. Theoretically, exponential 
distributions arise from summing over Gaussian distributions, both in the weakly 
and highly non-linear regimes, see [357] and [586,186] respectively. These results are 
also supported by N-body simulations [217,708]. 

135 See e.g. [363,488] and the recent analysis in [701] for a discussion of the statistical
significance of these features.

A recent linear analysis of the LCRS survey [443] using the KL transform
methods (see e.g. Sect. 6.11.4) parameterized the power spectrum in redshift
space by a smooth CDM model, and obtained a shape parameter
$\Gamma = 0.16 \pm 0.10$ and a normalization $b_1 \sigma_8 = 0.79 \pm 0.08$.

The two-point correlation function of LCRS galaxies was measured in [654,349],
and integrated along the line of sight to give the projected correlation function
in real space, which was found to agree with Eq. (626), with
$\gamma = 1.86 \pm 0.04$ and $r_0 = 5.06 \pm 0.12$ Mpc/h [349]. After modeling the
pairwise velocity distribution function by an exponential with dispersion and
mean (infall) velocity, the inferred pairwise velocity dispersion at 1 Mpc/h was
found to be $\sigma_v = 570 \pm 80$ km/sec, substantially higher than in other
surveys. In fact, another analysis of the LCRS survey in [395] found a pairwise
velocity dispersion of $\sigma_v = 363 \pm 44$ km/sec, closer to previous estimates.
In this case, the deconvolution of the small-scale redshift distortions was done
by a Fourier transform technique, assuming a constant velocity dispersion and
no infall [i.e. negligible $u_{12}$, see Eq. (198)]. At least part of this
disagreement can be traced to the effects of infall, as shown in [348]. For other
recent methods and applications to determining the small-scale pairwise velocity
dispersion and infall see e.g. [176] and [359], respectively.

Results from the power spectrum of the Stromlo-APM survey [412,639], the
Durham/UKST survey [318] and the ESO Slice Project [115] are in agreement
with previous results from optically selected galaxies, and show an amplification
compared to the power spectrum of IRAS galaxies implying a relative
bias factor $b_{\rm opt}/b_{\rm iras} \approx$ 1.2-1.3. This is reasonable, since IRAS
galaxies are selected in the infrared and are mostly spiral galaxies which, from
the observed morphology-density relation [192,529], tend to avoid clusters. We
shall come back to this point when discussing higher-order statistics.

The first measurements of counts-in-cells in the QDOT survey [200,549] showed
that IRAS galaxies were more highly clustered at scales of 30-40 Mpc/h
compared to the predictions of the standard CDM model, in agreement with the
angular correlation function from APM [422]. The QDOT power spectrum was
later measured in [212] using minimum variance weighting, giving redshift-space
values $\sigma_8 = 0.87 \pm 0.07$ and $\Gamma = 0.19 \pm 0.06$. Measurement of the
power spectrum of the 1.2Jy survey [215] confirmed and extended this result,
although it showed somewhat less power at large scales than QDOT.^136 The
measurement of the two-point function in redshift space for the 1.2Jy
survey [216,217] implied a real-space correlation function as in Eq. (626), with
$\gamma \simeq 1.66$ and $r_0 = 3.76$ Mpc/h for scales $r < 20$ Mpc/h, consistent
with the fact that IRAS galaxies are less clustered than optically selected
galaxies. In addition, the inferred velocity dispersion at 1 Mpc/h was
$\sigma_v = 317^{+40}_{-49}$ km/sec.

136 It was later shown that the QDOT measurements were sensitive to a small number
of galaxies in the Hercules supercluster [202,638], which was over-represented
in the QDOT sample presumably due to a statistical fluctuation in the random
numbers used to construct the survey.

[Figure: Hamilton & Tegmark (2001)]

Fig. 56. The real-space power spectrum of PSCz galaxies. To the left of the vertical
line is the linear measurement of [299] (points with uncorrelated error bars [297]),
while to the right is the nonlinear measurement of [298] (points with correlated error
bars). The dashed line corresponds to the flat $\Lambda$CDM concordance model power
spectrum from [650] with parameters as indicated, nonlinearly evolved according to
the prescription in [494] (from [298]).

Measurements in the PSCz survey are currently the most accurate estimates
of the clustering of IRAS galaxies. At large scales, the power spectrum is
intermediate between that of the QDOT and 1.2Jy surveys, whereas at smaller
scales it decreases slightly more steeply [612]. The shape of the large-scale
power spectrum is consistent with $\Gamma = 0.2$ CDM models, although it does not
strongly rule out other models [612,640]. A comparison with the Stromlo-APM
survey shows a relative bias parameter of
$b_{\rm Stromlo}/b_{\rm PSCz} \simeq 1.3$ and a correlation coefficient between
optical and IRAS galaxies of $R > 0.72$ at the 95% confidence
limit on scales of the order of 20 Mpc/h [573]. These results were considerably
extended in [298] to obtain the power spectrum in real space by measuring
the redshift-space power perpendicular to the line of sight and parameterizing
the dependence on non-perpendicular modes to increase signal to noise. The
resulting power spectrum is reproduced in Fig. 56. It shows a nearly power-law
behavior to the smallest scales measured, with no indication of an inflexion
at the non-linear scale, and no sign of turnover at the transition to the stable
clustering regime. Compared to the best fit CDM model (obtained from
a joint analysis with CMB fluctuations in [650] and shown as a dashed line),
the PSCz power spectrum requires a significant scale-dependent bias.
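
To make the comparison between the power-law fits quoted in this section more concrete, the following sketch evaluates the correlation function for the CfA [172], LCRS [349] and IRAS 1.2Jy [216,217] parameters given above. It assumes that Eq. (626) has the usual power-law form $\xi(r) = (r_0/r)^\gamma$; the names `FITS` and `xi_power_law` are illustrative and do not come from the surveys' analysis pipelines, and the quoted error bars are ignored.

```python
# Sketch: compare the power-law fits quoted in the text, assuming
# Eq. (626) has the usual form xi(r) = (r0 / r)**gamma.

FITS = {
    "CfA": {"gamma": 1.77, "r0": 5.4},          # [172]
    "LCRS": {"gamma": 1.86, "r0": 5.06},        # [349]
    "IRAS 1.2Jy": {"gamma": 1.66, "r0": 3.76},  # [216,217]
}


def xi_power_law(r, r0, gamma):
    """Power-law two-point correlation function at separation r (Mpc/h)."""
    return (r0 / r) ** gamma


if __name__ == "__main__":
    for name, params in FITS.items():
        values = ", ".join(
            f"xi({r:g}) = {xi_power_law(r, **params):.2f}" for r in (1.0, 5.0, 10.0)
        )
        print(f"{name:10s}: {values}")
```

By construction $\xi(r_0) = 1$, so the smaller correlation length of the IRAS sample translates directly into a lower clustering amplitude at fixed separation.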

Finally, we briefly mention results on the parameter
$\beta \approx \Omega_m^{0.6}/b_1$ from measurements of the anisotropy of the power
spectrum in redshift space^137 (see Sect. 7.4.2). These measurements are
complicated by the fact that surveys are not yet large enough to see a clear
transition into the linear regime predictions, Eq. (614). In addition, different
methods seem to give somewhat different answers [295]; however, the average and
standard deviation of reported values are [295]
$\beta_{\rm opt} = 0.52 \pm 0.26$ and $\beta_{\rm iras} = 0.77 \pm 0.22$ for optically
selected and IRAS galaxies, respectively, which is roughly consistent with the
relative bias between these two populations. On the other hand, the most recent
results from the PSCz survey find $\beta = 0.39 \pm 0.12$ [643] and
$\beta = 0.41^{+0.13}_{-0.12}$ [299].
Constraints from the most recent optically selected surveys are considerably
noisier, e.g. Stromlo-APM does not even exclude $\beta \simeq 1$ [412,639], and LCRS
is consistent with no distortions at all, $\beta = 0.30 \pm 0.39$ [443]. Resolution of
these issues will have to await results from the full-volume 2dFGRS and SDSS
surveys (see also Sect. 8.4).
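
The statement that the averaged $\beta$ values are roughly consistent with the relative bias can be checked with a short calculation. Since $\beta \approx \Omega_m^{0.6}/b_1$, two galaxy populations sharing the same $\Omega_m$ satisfy $b_{\rm opt}/b_{\rm iras} = \beta_{\rm iras}/\beta_{\rm opt}$. The sketch below propagates the quoted scatter to first order, assuming independent Gaussian errors; this is a simplification, since the quoted numbers are standard deviations among different methods rather than formal errors. The helper name `ratio_with_error` is illustrative.

```python
import math


def ratio_with_error(a, sigma_a, b, sigma_b):
    """Return a/b and its first-order uncertainty, assuming independent errors."""
    ratio = a / b
    relative_error = math.hypot(sigma_a / a, sigma_b / b)
    return ratio, ratio * relative_error


# Average and standard deviation of reported values, as quoted from [295].
beta_opt, err_opt = 0.52, 0.26
beta_iras, err_iras = 0.77, 0.22

# b_opt / b_iras = beta_iras / beta_opt when both samples share Omega_m.
rel_bias, rel_bias_err = ratio_with_error(beta_iras, err_iras, beta_opt, err_opt)
print(f"b_opt/b_iras = {rel_bias:.2f} +/- {rel_bias_err:.2f}")
```

Under these assumptions the implied relative bias is close to 1.5 with an uncertainty of order 0.9, which comfortably brackets the range 1.2-1.3 inferred from the power spectrum amplitudes.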

8.3.3 Three-Point Statistics 

Determination of three-point statistics from redshift surveys has been carried 
out mostly in the non-linear regime for optically selected surveys, and mostly 
in the weakly non-linear regime for IRAS surveys. Table 18 shows different 
estimates of the three-point function (top list) and the bispectrum (bottom 
list). 

As discussed before, the CfA sample covers too small a volume to give a fair estimate
of higher-order correlations. Even more so, estimates in the Durham-AAT
and KOSS samples are subject to large estimator biases as they have only
a few hundred redshifts. Nonetheless, these results roughly agree with each
other, although the values of $Q_3$ are seen to fluctuate significantly. Note that
the values in Table 18 are not directly comparable to those inferred from 
deprojection of angular catalogs (Table 15) as they are affected by redshift 
distortions (see e.g. Fig. 48). 

The LCRS survey provides the best estimate to date of the three-point function
at small scales [347]. Estimation of $Q_3$ in redshift space and in projected
space (by integrating along the line of sight) showed values lower by a factor
of about 2 than $\Lambda$CDM simulations where clusters have been underweighted
by $m^{-0.08}$, essentially equivalent to assuming that the number of galaxies as a
function of dark matter halo mass $m$ scales as $N_{\rm gal}(m) \sim m^{0.9}$ in the
notation of Sect. 7.1.3. The authors conclude that the hierarchical model is not a
good description of the data, since they see some residual (small) scale and
configuration dependence. However, as discussed at the end of Sect. 7.4.3, one does
not expect the hierarchical model to be a good description for correlations in
redshift space since velocity dispersion creates “fingers of god” along the
observer’s direction [562]. The fact that these are clearly seen by visual inspection
of the galaxy distribution ought to show up in a clear shape dependence of the
three-point function: colinear configurations should be significantly amplified
(see Fig. 48). Surprisingly, this is not seen in the LCRS measurements [347].

137 For an exhaustive review of these results up to mid 1997 see [295].

Table 18

Some measurements of $Q_3$ in redshift catalogs. In most cases, only the mean values
over a range of scales were published. In cases where measurements of the individual
values for each scale are reported in the literature, we quote the actual range of
estimates over the corresponding range of scales. The top half of the table is in
configuration space, the bottom part in Fourier space. Scales are in Mpc/h and h/Mpc,
respectively. When possible, we give estimates for equilateral (eq) and colinear (col)
configurations. Error bars should be considered only as rough estimates, see text
for discussion.

3 Q_3                        Scales      Sample        Year   Ref
------------------------------------------------------------------------------
2.4 ± 0.2                    -           CfA           1980   [508] (eq. [57.9])
2.04 ± 0.15                  -           [541]         1981   [508]
2.4 ± 0.3                    1-2         CfA           1984   [198]
1.8 ± 0.2                    1-3         Durham-AAT    1983   [32]
3.9 ± 0.9                    1-2         KOSS          1983   [32]
1.5-4.5                      1-8         LCRS          1998   [347]
------------------------------------------------------------------------------
Q_eq ~ 0.5                   0.1-1.6     CfA/PPS       1991   [31]
Q_3 ~ 1                      0.05-0.2    QDOT          2001   [567]
Q_eq ~ 0.2; Q_col ~ 0.6      0.05-0.2    IRAS 1.9Jy    2001   [567]
Q_eq ~ 0.4; Q_col ~ 0.8      0.05-0.2    IRAS 1.2Jy    2001   [567]
Q_eq ~ 0.4; Q_col ~ 1.4      0.05-0.4    PSCz          2001   [211]

Fig. 57. The redshift-space reduced bispectrum $Q_{\rm eq}$ for equilateral triangles as a
function of scale $k$, for CfA/PPS galaxies (from [31]). The dashed line shows the
PT prediction: $Q_{\rm eq} = 4/7$. The other lines show predictions for the cooperative
galaxy formation models, see Sect. 8.3.5.

138 This is due to accidental cancellations in redshift space. At larger $k$’s, in the
absence of redshift distortions, $Q_{\rm eq}(k)$ increases, see e.g. Fig. 15. However, velocity
dispersion suppresses this rise, resulting in approximately the same value as in
PT [562]. The same is not true for colinear configurations, see Fig. 48.

Measurements of the bispectrum (for equilateral configurations) in redshift
space were first carried out for the CfA survey and a sample of redshifts in
the Pisces-Perseus super-cluster [31]. This was the first measurement that
reached partially into the weakly non-linear regime and compared the bispectrum
for equilateral configurations with PT predictions, $Q_{\rm eq} = 4/7$. As
shown in Fig. 57 the agreement with PT predictions is very good, even into
the non-linear regime.^138 The error bars in each bin indicate the variance
among different subsamples, 3 from the CfA and 3 from the Perseus-Pisces
surveys. This result was interpreted as support for gravitational instability
from Gaussian initial conditions and in disagreement with models of threshold
bias [21,344], which predicted $Q_{\rm eq} \simeq 1$. The results in Fig. 57 were later
used in [224] to constrain models of non-local bias that had been proposed
to give galaxies extra large-scale power in the standard CDM scenario (see
Sect. 8.3.5 for a discussion). In addition, [31] measured the trispectrum for
randomly generated tetrahedral configurations, showing a marginal detection
with hierarchical scaling consistent with $Q_4 \sim 1$.
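
The PT value $Q_{\rm eq} = 4/7$ quoted above can be reproduced with a few lines of code. The sketch below evaluates the tree-level reduced bispectrum in real space for Gaussian initial conditions, using the standard second-order kernel $F_2$ and, as a simplifying assumption, a power-law spectrum $P(k) \propto k^n$. Redshift distortions and galaxy bias are not included, and the function names and the default index $n = -1.5$ are illustrative choices.

```python
import math


def f2_kernel(k1, k2, cos_theta):
    """Second-order PT kernel F2 for two wavevectors with moduli k1, k2
    and cosine cos_theta of the angle between them."""
    return (5.0 / 7.0
            + 0.5 * cos_theta * (k1 / k2 + k2 / k1)
            + (2.0 / 7.0) * cos_theta ** 2)


def reduced_bispectrum(k1, k2, theta, n=-1.5):
    """Tree-level Q for a triangle whose sides k1 and k2 are separated by
    the angle theta (radians), for a power-law spectrum P(k) ~ k**n.
    Degenerate triangles with vanishing third side are not handled."""
    # Third side follows from the closure condition k1 + k2 + k3 = 0.
    k3 = math.sqrt(k1 ** 2 + k2 ** 2 + 2.0 * k1 * k2 * math.cos(theta))

    def power(k):
        return k ** n

    # Cosines of the angles between each pair of wavevectors.
    c12 = math.cos(theta)
    c23 = (k1 ** 2 - k2 ** 2 - k3 ** 2) / (2.0 * k2 * k3)
    c31 = (k2 ** 2 - k3 ** 2 - k1 ** 2) / (2.0 * k3 * k1)

    numerator = 2.0 * (f2_kernel(k1, k2, c12) * power(k1) * power(k2)
                       + f2_kernel(k2, k3, c23) * power(k2) * power(k3)
                       + f2_kernel(k3, k1, c31) * power(k3) * power(k1))
    denominator = (power(k1) * power(k2) + power(k2) * power(k3)
                   + power(k3) * power(k1))
    return numerator / denominator


if __name__ == "__main__":
    # Equilateral triangle: the wavevectors are separated by 120 degrees.
    print(reduced_bispectrum(1.0, 1.0, 2.0 * math.pi / 3.0))
    # Colinear triangle with k1/k2 = 2.
    print(reduced_bispectrum(2.0, 1.0, 0.0))
```

For equilateral triangles the three terms are identical and the power spectrum cancels, so the result does not depend on $n$; for other shapes the value depends on both the configuration and the spectral index, which is the shape dependence discussed in the text.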

Detailed measurements of the bispectrum in the weakly non-linear regime were
not done until a decade later, with the analyses of the IRAS surveys [567,211],
which probe a large enough volume of roughly spherical shape. In [567],
measurements were done for the QDOT, 1.9Jy and 1.2Jy surveys. In order to
constrain galaxy bias and primordial non-Gaussianity, a likelihood method
developed in [566] was used, which takes into account the covariance matrix of
the bispectrum for different triangles and the non-Gaussian shape of the
likelihood function (see e.g. Fig. 42). This is essential to recover accurate
estimates of errors on bias parameters and primordial non-Gaussianity without
systematic estimator biases due to the finite volume of the survey.^139 The
results from QDOT were marginal: due to the very sparse sampling (one galaxy in
every six), $Q_3$ was only shown to be of order unity without any discernible
dependence on configuration. The results from 1.9Jy and 1.2Jy showed a
systematic shape dependence similar to that predicted by gravitational instability.

These results were considerably extended with the analysis of the PSCz
bispectrum [211]. Figure 58 shows the PSCz reduced bispectrum $Q_3$ as a function
of the angle $\theta$ between $\mathbf{k}_1$ and $\mathbf{k}_2$ for triangles with
$k_1/k_2 \simeq 2$ and different scales as described in the figure caption [211].
The configuration dependence predicted by gravitational instability [232,313]
(solid lines for an unbiased distribution, predicted by 2LPT, see e.g. Fig. 48)
is clearly seen in the data. This is also the case for all triangles, not just
those shown in Fig. 58, see Fig. 1 in [211].

Implications of these results for galaxy biasing and primordial non-Gaussianity
are discussed in Sect. 8.3.5.

Fig. 58. The bispectrum $Q_3$ vs. $\theta$ for the PSCz catalog for triangles with
$0.2 < k_1 < 0.4$ h/Mpc and with two sides of ratio $k_2/k_1 =$ 0.4-0.6 separated
by angle $\theta$. The solid curve shows $Q_3$ in redshift space averaged over many 2LPT
realizations of the $\Lambda$CDM model. Symbols show results from the PSCz survey for
bands in $k_1$: filled triangles, $k_1 =$ 0.20-0.24 h/Mpc; filled squares, 0.24-0.28;
filled circles, 0.28-0.32; open circles, 0.32-0.36; and open squares, 0.36-0.42.
The dashed curve shows the 2LPT prediction for $\Lambda$CDM with the best-fit bias
parameters $1/b_1 = 1.20$, $b_2/b_1^2 = -0.42$. Taken from [211].

139 A likelihood analysis of the bispectrum was first proposed in [434],
based on the Gaussian approximation for the likelihood function and a second-order
Eulerian PT calculation of the covariance matrix. Extensions to redshift space are
given in [669].



Table 19

Some measurements of $S_3$ and $S_4$ in redshift catalogs. In most cases, only the mean
values over a range of scales were published. In cases where measurements of the
individual values for each scale are reported in the literature, we quote the actual
range of estimates over the corresponding range of scales. In most cases error bars
should be considered only as rough estimates, see text for discussion.

S_3                  S_4          Scales    Sample         Year   Ref
------------------------------------------------------------------------
2 ± 1 - 6 ± 4        -            5-20      QDOT           1991   [549]
1.5 ± 0.5            4.4 ± 3.7    0.1-50    IRAS 1.2Jy     1992   [88], [92]
1.9 ± 0.1            4.1 ± 0.6    2-22      CfA            1992   [246]
2.0 ± 0.1            5.0 ± 0.9    2-22      SSRS           1992   [246]
2.1 ± 0.3            7.5 ± 2.1    3-10      IRAS 1.9Jy     1994   [224]
2.4 ± 0.3            13 ± 2       2-10      PPS            1996   [267]
2.8 ± 0.1            6.9 ± 0.7    8-32      IRAS 1.2Jy     1998   [377]
1.8 ± 0.1            5.5 ± 1      1-10      SSRS2          1999   [35]
1.9 ± 0.6            7 ± 4        1-30      PSCz           2000   [632]
1.82 ± 0.21          ~ 3          12.6      Durham/UKST    2000   [319]
2.24 ± 0.29          ~ 8          18.2      Stromlo/APM    2000   [319]


8.3.4 Skewness, Kurtosis and Higher-Order Cumulants 

Table 19 shows different estimates for $S_3 = \bar\xi_3/\bar\xi_2^2$ and
$S_4 = \bar\xi_4/\bar\xi_2^3$, the ratios of the cumulants $\bar\xi_N$ obtained by
counts-in-cells. The shape of the cells corresponds to top-hat spheres, unless
stated otherwise.
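
As an illustration of how such ratios can be obtained from a list of cell counts, the sketch below uses factorial moments, one common way of removing Poisson shot noise from counts-in-cells: for a Poisson-sampled field, $\langle N(N-1)\cdots(N-k+1)\rangle = \bar N^k \langle (1+\delta)^k \rangle$. This is a minimal example under that assumption, not the estimator of any particular survey analysis; it ignores edge effects, selection functions and error estimation, and the function names are illustrative.

```python
def falling_factorial(n, k):
    """Return n (n-1) ... (n-k+1)."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


def hierarchical_amplitudes(counts):
    """Estimate xi2, xi3, xi4 and S3, S4 from a list of counts in cells,
    correcting for Poisson discreteness through factorial moments."""
    n_cells = len(counts)
    nbar = sum(counts) / n_cells

    def normalized_moment(k):
        # Estimates <(1 + delta)**k> for a Poisson-sampled field.
        total = sum(falling_factorial(c, k) for c in counts)
        return total / n_cells / nbar ** k

    m2, m3, m4 = (normalized_moment(k) for k in (2, 3, 4))
    # Connected moments (cumulants) of the density contrast delta.
    xi2 = m2 - 1.0
    xi3 = m3 - 3.0 * m2 + 2.0
    xi4 = m4 - 4.0 * m3 - 3.0 * m2 ** 2 + 12.0 * m2 - 6.0
    return {"xi2": xi2, "xi3": xi3, "xi4": xi4,
            "S3": xi3 / xi2 ** 2, "S4": xi4 / xi2 ** 3}


if __name__ == "__main__":
    # Toy counts, for illustration only (not survey data).
    print(hierarchical_amplitudes([0, 0, 1, 3, 6]))
```

In practice the counts come from many cells of a given radius thrown over the survey volume, and the calculation is repeated for each smoothing scale.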

The QDOT results by [549] were obtained from counts-in-cells with a Gaussian
window. The errors, from a minimum variance scheme, are quite large but they
suggest a hierarchical scaling $\bar\xi_3 \sim \bar\xi_2^2$, with a value of $S_3$
consistent with gravity from Gaussian initial conditions, as argued in [137].

Figure 59 displays the 1.2Jy IRAS results ([88], [92], left panel) and CfA-SSRS
results ([246], right panel). There is convincing evidence for the hierarchical
scaling in $\bar\xi_3$ and $\bar\xi_4$ (denoted by straight lines) but the resulting
$S_3$ and $S_4$ amplitudes are probably affected by sampling biases (see discussion
below). Note that the scaling is preserved well into the non-linear regime; this
is in agreement with expectations from N-body simulations which show that in
redshift space the growth of the $S_p$ parameters towards the non-linear regime is
suppressed by velocity dispersion from virialized regions ([391,437], see e.g.
Fig. 49).

Fig. 59. Values of $\bar\xi_3(R)$ and $\bar\xi_4(R)$, as a function of $\bar\xi_2(R)$ in the
IRAS (left, from [92]) and in the CfA and SSRS (right, from [246]) redshift catalogs.
The lines show the best fit amplitude for the hierarchical scaling
$\bar\xi_N = S_N \bar\xi_2^{N-1}$.

In their analysis of higher-order moments in the CfA, SSRS, and IRAS 1.9Jy
catalogs, [224] studied the sensitivity of $S_p$ to redshift distortions, by
calculating moments in spherical cells and conical cells. The latter were argued
to be less sensitive to the redshift-space mapping that acts along the line of
sight.^140 They found that although cumulants are sensitive to the change in
cell geometry, the $S_p$ parameters were not.

On the other hand, [267] estimated the third and fourth order cumulants using
moments of counts centered on galaxies [84] in the PPS. After a somewhat ad hoc
correction for virial fingers to recover “real space” quantities, they find a
variation of $S_3$ and $S_4$ with scale, compatible with a non-negligible cubic term,
e.g. $\bar\xi_3 \simeq S_3 \bar\xi_2^2 + C_3 \bar\xi_2^3$. Since the scale where the cubic
term becomes important is found to be about 5 Mpc/h, this is perfectly consistent
with gravitational clustering: at these scales loop corrections are expected to
increase (the real-space) $S_3$ and $S_4$, see e.g. Figs. 28 and 49.

140 This is certainly true in the limit of large radial distances. At finite size, structures
will still look less concentrated in conical cells than in real space due to velocity
dispersion. Note that the conical geometry may introduce a change in $S_p$ since not
all N-point configurations are weighted equally.

Fig. 60. The redshift-space skewness $S_3$ and kurtosis $S_4$ as a function of smoothing
scale in the PSCz survey [211].

An alternative method to moments of counts-in-cells was proposed in [377],
who parameterized the count PDF by an Edgeworth expansion (see Sect. 3.5)
convolved with a Poisson distribution to take into account discreteness effects.
This method is only applicable at large enough scales (and small enough
$\delta/\sigma$) so that the Edgeworth expansion holds; however, convolution with a
Poisson distribution helps to regularize the resulting PDF (i.e. it is positive
definite^141). The advantage of this method is that one can obtain the $S_p$ from
a likelihood analysis of the shape of the PDF near its maximum, rather than
relying on the tails of the distribution which are sensitive to rare clusters, as
in the moments method.^142 One disadvantage is that error estimation in this
framework is more complicated, although in principle not insurmountable.
Results from N-body simulations show this method to be more reliable at large
scales [377] than the standard approach. Application to the 1.2Jy survey [377]
resulted in values for $S_3$ and $S_4$ significantly higher than in previous work
using moments [92], see Table 19.
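
The structure of the Edgeworth parameterization can be sketched as follows. The code evaluates the expansion of the density PDF around a Gaussian up to second order in $\sigma$, in terms of $S_3$ and $S_4$ and the Hermite polynomials $H_3$, $H_4$ and $H_6$, which is assumed here to be the form given in Sect. 3.5. The Poisson convolution used in [377] to account for discreteness is deliberately left out, so this is only the continuous ingredient of the method; the function name and the parameter values in the usage lines are illustrative.

```python
import math


def edgeworth_pdf(delta, sigma, s3, s4=0.0):
    """Edgeworth expansion of the PDF of the density contrast delta,
    to second order in sigma, for given skewness S3 and kurtosis S4."""
    nu = delta / sigma
    # Hermite polynomials entering the expansion.
    h3 = nu ** 3 - 3.0 * nu
    h4 = nu ** 4 - 6.0 * nu ** 2 + 3.0
    h6 = nu ** 6 - 15.0 * nu ** 4 + 45.0 * nu ** 2 - 15.0
    gaussian = math.exp(-0.5 * nu * nu) / (math.sqrt(2.0 * math.pi) * sigma)
    correction = (1.0
                  + sigma * s3 / 6.0 * h3
                  + sigma ** 2 * (s4 / 24.0 * h4 + s3 ** 2 / 72.0 * h6))
    return gaussian * correction


if __name__ == "__main__":
    # Illustrative values: sigma = 0.3, S3 = 2, S4 = 6.
    for delta in (-0.6, -0.3, 0.0, 0.3, 0.6):
        print(f"{delta:+.1f}  {edgeworth_pdf(delta, 0.3, 2.0, 6.0):.4f}")
```

Because the Hermite polynomials integrate to zero against the Gaussian, the truncated series stays normalized, but it can become negative in the tails when $\sigma$ or $|\delta/\sigma|$ is large, which is the regime where the text notes the expansion no longer holds.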

Measurements of the higher-order moments in the SSRS2 survey were obtained
in [35]. Results for $S_3$ and $S_4$ were shown to be consistent with hierarchical
scaling at all scales probed (the error bars quoted in Table 19 were found by
averaging over all scales assuming uncorrelated measurements). A study of the
errors in numerical simulations showed that bootstrap resampling errors were
underestimates by a factor of order two. A re-analysis of the data using the
Edgeworth method of [377] showed that $S_3$ changed upward by a factor of about
two to $S_3 \simeq 3$, similar to the change seen in the IRAS 1.2Jy survey.

141 However, for future applications to surveys not as sparse as the IRAS galaxy
distribution, such as 2dFGRS and SDSS, this will not be the case.

142 The peak of the PDF is however sensitive to the largest voids in the sample (see
e.g. Fig. 20), which can influence the most likely value of $\delta$ and thus the $S_p$
derived from such method.

A recent analysis of the PSCz survey [632], which should be affected much
less than previous IRAS surveys by finite volume effects, was carried out by
using minimum variance estimates of moments of counts-in-cells in volume
limited subsamples (see Sect. 6.9). The values of $S_3$ and $S_4$ found, shown in
Fig. 60, are consistent within the errors^143 with those of previous IRAS results,
including those found by deprojection from angular counts [451,92,239] and
also (for $S_3$) in agreement with the amplitude obtained from measurements of
the bispectrum [211] (see Fig. 58). They also found that the measurements of
$S_3$ and $S_4$ agreed very well with the predictions of the semi-analytic galaxy
formation model in [36], based on models of spiral galaxies in the framework
of $\Lambda$CDM models.

A similar analysis technique was used in the Stromlo-APM and Durham/UKST
surveys [319]. In this case measurements of the skewness are in agreement with
those found in shallower redshift surveys (CfA, IRAS 1.2Jy, SSRS2) but with
larger (but more realistic) errors. Comparison with deprojected values for
$S_3$ and $S_4$ obtained from the parent catalogs APM [249] and EDSGC [622]
shows a systematic trend where redshift surveys give smaller
values than angular surveys. The most significant contribution to this apparent
discrepancy is likely to be redshift distortions: as shown in Fig. 49 for
scales $R < 20$ Mpc/h the $S_p$ parameters are suppressed in redshift space.^144
At scales larger than 20 Mpc/h results from the redshift and parent angular
surveys should agree, since redshift distortions do not affect the $S_p$
significantly [313]. In this regime, the results from APM/EDSGC surveys seem
systematically higher, although by no more than $1\sigma$ given the large error bars.
In this case other systematic effects might be taking place. Deprojection from
angular surveys using the hierarchical model rather than the configuration
dependence predicted by PT can cause an overestimation of the 3D $S_p$ that
can be as much as 20% for $S_3$ (see e.g. Fig. 47). In addition, finite volume
effects [621,328,630] as discussed in Chapter 6 can lead to underestimation of
$S_p$ from redshift surveys that are typically sampling a smaller volume.^145

143 One should take into account that errors in previous analyses have been
underestimated. The more realistic errors in [632] were obtained using the FORCE
code [621,152,630], which is based on the full theory of cosmic errors as described
in Chapter 6.

144 This is for dark matter, however at these scales bias should not make a qualitative
difference. Furthermore, deviations in galaxy surveys are seen at similar scales [319].

145 These effects are thought to be dominant for smaller surveys such as CfA/SSRS,
see [328] for a detailed discussion.

8.3.5 Constraints on Biasing and Primordial Non-Gaussianity

We now review implications of the above results for biasing and primordial
non-Gaussianity, concentrating on higher-order statistics. Effects of primordial
non-Gaussianity on the power spectrum have been considered in [212,605,612].
The results presented here are complementary to recent studies of the impact
of primordial non-Gaussian models in other aspects of large-scale structure
such as the abundance of massive clusters [538,385,691,522].

Results on the redshift-space bispectrum in the CfA/PPS sample [31] (see
Fig. 57) and the skewness of CfA/SSRS surveys [246] were used in [224] to
put constraints on the non-local (scale-dependent) bias in the cooperative
galaxy formation (CGF) model [96] proposed to generate enough large-scale
power in the context of otherwise-standard CDM. This model corresponds to
a (density-dependent) threshold bias model where galaxies form in regions
satisfying $\delta > \nu\sigma - \kappa\,\delta(R_s)$, where $\kappa$ is the strength of
cooperative effects and $R_s$ describes the “scale of influence” of non-locality.
Figure 57 shows the predictions of CGF models for
$(\kappa, R_s) = (0.84, 10\,h^{-1}\,{\rm Mpc})$ (dot-long-dashed),
$(2.29, 20\,h^{-1}\,{\rm Mpc})$ (solid), and $(4.48, 30\,h^{-1}\,{\rm Mpc})$
(dot-short-dashed), all of which have similar large-scale power to a
$\Gamma = 0.2$ CDM model. Because of the scale dependence induced by the CGF
models, additional linear bias is required to suppress these features, which in
turn implies non-zero non-linear bias to maintain agreement with
$Q_3 \simeq 0.5$ and also would be in disagreement with the normalization implied
by the CMB [596]. In addition, this would make the agreement with the simple
prediction of PT from Gaussian initial conditions purely accidental. Similar
results follow from the analysis of the skewness $S_3$, see [224].

As discussed in Sect. 8.3.3, the detection of the configuration dependence of the
bispectrum in IRAS surveys (see e.g. Fig. 58) gives a tool to constrain galaxy
bias and primordial non-Gaussianity, and to break degeneracies present in
two-point statistics. Using a maximum likelihood method that takes into account
the non-Gaussianity of the cosmic distribution function and the covariance matrix
of the bispectrum [566], the constraints on local bias parameters from IRAS
surveys assuming Gaussian initial conditions^146 read [567,211]

$\frac{1}{b_1} = 1.32^{+0.36}_{-0.58}, \quad \frac{b_2}{b_1^2} = -0.57^{+0.45}_{-0.30}$,   (2Jy)     (630)

$\frac{1}{b_1} = 1.15^{+0.39}_{-0.39}, \quad \frac{b_2}{b_1^2} = -0.50^{+0.31}_{-0.51}$,   (1.2Jy)   (631)

$\frac{1}{b_1} = 1.20^{+0.18}_{-0.19}, \quad \frac{b_2}{b_1^2} = -0.42^{+0.19}_{-0.19}$,   (PSCz)    (632)

with the best fit model shown as a dashed line in Fig. 58 for the PSCz
case. These results for the linear bias of IRAS galaxies, when coupled with
measurements of the power spectrum redshift distortions, which determine
$\beta \approx \Omega_m^{0.6}/b_1 = 0.4 \pm 0.12$ for the PSCz survey [299,643], allow
one to break the degeneracy between linear bias and $\Omega_m$, giving
$\Omega_m = 0.16 \pm 0.1$.
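
The central value of $\Omega_m$ quoted above follows from combining the two measurements. The sketch below inverts $\beta \approx \Omega_m^{0.6}/b_1$ using the PSCz central values $\beta = 0.4$ and $1/b_1 = 1.20$; it only reproduces the central value and makes no attempt at the error propagation behind the quoted uncertainty. The function name is illustrative.

```python
def omega_m_from_beta(beta, inv_b1, exponent=0.6):
    """Invert beta = Omega_m**exponent / b1 for Omega_m, given 1/b1."""
    b1 = 1.0 / inv_b1
    return (beta * b1) ** (1.0 / exponent)


if __name__ == "__main__":
    # PSCz central values quoted in the text.
    print(f"Omega_m = {omega_m_from_beta(0.4, 1.20):.2f}")
```

Since $1/b_1 > 1$ for IRAS galaxies, the inferred $\Omega_m$ is lower than the value $\beta^{1/0.6} \approx 0.22$ that would follow from assuming an unbiased galaxy distribution.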

146 In addition, these constraints assume a fixed linear power spectrum shape given
by $\Gamma = 0.21$, in agreement with power spectrum measurements. See [566,567] for
sensitivity of bias parameters on the assumed power spectrum shape. The dependence
of the bispectrum on the assumed $\Omega_m$ is negligible, as first pointed out in [313].

If bias is local in Lagrangian, rather than Eulerian, space the bispectrum
shape depends differently on bias parameters [120], see Eq. (537). Physically
this corresponds to galaxies that form depending exclusively on the initial
density field, and then evolved by gravity. Eulerian bias, on the other hand,
corresponds to the other extreme limit where galaxies form depending
exclusively on the present (non-linear) density field. Both limits are undoubtedly
simplistic, but analysis of the bispectrum in the PSCz survey suggests that the
Eulerian bias model is more likely than the Lagrangian one [211].

The bispectrum results can also be used to constrain non-Gaussian initial
conditions. In this case one must also take into account the possibility of galaxy
biasing, which is more complicated since the usual formula for Gaussian initial
conditions, Eq. (528), is not valid anymore, but it is calculable in terms of the
primordial statistics [565]. Using a $\chi^2$ model as an example of dimensional
scaling models (where $\xi_N \sim \xi_2^{N/2}$, see Sect. 4.4.2), it was shown that the
IRAS 1.2Jy bispectrum is inconsistent with the amplitude and scaling of this type
of initial conditions at the 95% level [567].

The PSCz bispectrum provides stronger constraints on non-Gaussian initial conditions. In [211] $\chi^2_N$ statistics were considered as a general example of dimensional scaling models. For $N = 1$, this corresponds to the predictions of some inflationary models with isocurvature perturbations [513,7,404]; as $N \to \infty$ the model becomes effectively Gaussian, and for a fixed power spectrum (taken to fit that of PSCz) the primordial bispectrum obeys $Q^I \propto N^{-1/2}$ [565]. From the PSCz data, it follows that $N > 49\,(22)$ at 68% (95%) CL. Since the primordial dimensionless skewness is $B_3 = 2.46$ for a $\chi^2$ field [514], the PSCz bispectrum constrains $B_3 < 0.35\,(0.52)$. These results are independent of (local) biasing, and they are obtained by marginalizing over bias parameters [211].
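The conversion from the bound on $N$ to the bound on $B_3$ follows from the $N^{-1/2}$ scaling quoted above. The sketch below assumes that the skewness of a $\chi^2_N$ field scales as $B_3(N) = 2.46/\sqrt{N}$, i.e. with the same $N^{-1/2}$ dependence as the primordial bispectrum; the function name is illustrative only.

```python
import math

B3_CHI2 = 2.46  # primordial dimensionless skewness of a chi^2 (N = 1) field [514]

def b3_upper_bound(n_min):
    """Upper bound on B_3 implied by N > n_min, assuming B_3 scales as N**-0.5."""
    return B3_CHI2 / math.sqrt(n_min)

bound_68 = b3_upper_bound(49)  # 68% CL
bound_95 = b3_upper_bound(22)  # 95% CL
print(f"B_3 < {bound_68:.2f} (68% CL), B_3 < {bound_95:.2f} (95% CL)")
```

Under this assumption the two bounds come out at 0.35 and 0.52, matching the values quoted in the text.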

8.4 Recent Results from 2dFGRS and SDSS

Looking at the overall picture, clustering statistics have been measured in a wide range of observational data. The catalogs listed in Tables 14 and 17 cover angular surface densities that are almost six orders of magnitude apart, solid angles ranging over more than three orders of magnitude, depths that go from 50 to 2000 $h^{-1}$ Mpc, and volumes ranging over three orders of magnitude. They also involve quite different systematics, from photographic plates to satellite missions and different observational filters. Despite these large differences, and after carefully correcting for systematic effects, all data on higher-order statistics in the weakly non-linear regime seem to be in good agreement with gravitational instability from Gaussian initial conditions. This provides a remarkable step forward in our understanding of structure formation and points to gravity as the basic mechanism to build cosmic structure from small primordial fluctuations generated in the early universe.

Needless to say, the observational results reviewed here, although providing a consistent picture, have significant limitations. The magnitude of statistical and systematic errors is still rather large and the range of scales available in the weakly non-linear regime is quite restricted. In the next few years this is expected to change significantly, with the completion of the new generation of wide field surveys such as 2dFGRS and SDSS. Here we provide a brief summary of the results that have been recently reported in the literature from these preliminary samples.

The 2dFGRS has recently publicly released the first versions of its galaxy and quasar catalogs, containing 100,000 [142] and 10,000 redshifts [167], respectively. The completed survey is expected to reach 250,000 galaxies and 25,000 quasars. The parent source catalog is an extended and revised version of the APM survey [425], containing galaxies with magnitudes $b_J < 19.45$. For a review of the recent results see [496].

A measurement of the redshift-space two-point correlation function was presented in [497] from analysis of 141,402 galaxies. Using a phenomenological model similar to that in Eq. (617) with input real-space power spectrum obtained by deprojection from the APM survey [27], they obtain a velocity dispersion parameter $\sigma_v = 385$ km/sec and, after marginalizing over $\sigma_v$, a best fit estimate of $\beta = 0.43 \pm 0.07$. These results are obtained by considering only the two-point function data for $8\,h^{-1}$ Mpc $< r < 25\,h^{-1}$ Mpc.

A preliminary analysis of the redshift-space power spectrum is presented in [518], based on a sample of 147,024 galaxies. After taking into account the window of the survey, and assuming linear perturbation theory at scales $0.02 < k < 0.15\,h$/Mpc, they find that models containing baryon oscillations are marginally ($\sim 2\sigma$) preferred over featureless spectra. Assuming scale invariance for the primordial power spectrum, their analysis gives $\Omega_m h = 0.20 \pm 0.03$ and a baryon fraction $\Omega_b/\Omega_m = 0.15 \pm 0.07$, in good agreement with recent determinations from measurements of the CMB power spectrum [476,282]. The most recent analysis [652] of the publicly released 100,000 galaxy sample using KL eigenmodes finds, however, no significant detection of baryonic wiggles, although their results are consistent with the previous analyses using a larger sample but less sophisticated techniques.

Using a series of volume-limited samples, [480] present a measurement of the projected correlation function by integrating the redshift-space two-point function along the line of sight. The result is well described by a power-law in pair separation over the range $0.1\,h^{-1}$ Mpc $< r < 10\,h^{-1}$ Mpc, with $r_0 = 4.9 \pm 0.3\,h^{-1}$ Mpc and $\gamma = 1.71 \pm 0.06$, see Eq. (626). Measurements for different samples spanning a factor of 40 in luminosity show remarkably little variation in the power-law slope, with all correlation functions being almost parallel with amplitudes spanning a factor of about three.

These results have been confirmed by recent measurements in a preliminary sample of the SDSS survey [704] containing 29,300 galaxy redshifts. They find a scale-independent luminosity bias for scales $r < 10\,h^{-1}$ Mpc, with different subsamples having nearly parallel projected correlation functions with power-law slope $\gamma \approx 1.8$. For the whole sample, the correlation length is $r_0 = 6.1 \pm 0.2\,h^{-1}$ Mpc and the power-law slope $\gamma = 1.75 \pm 0.03$, for scales $0.1\,h^{-1}$ Mpc $< r < 16\,h^{-1}$ Mpc. The inferred velocity dispersion is $\sigma_v \approx 600 \pm 100$ km/sec, nearly independent of scale for projected separations $0.15\,h^{-1}$ Mpc $< r_p < 5\,h^{-1}$ Mpc.

A series of papers have recently analyzed angular clustering of over a million galaxies in a rectangular stripe of $2.5^\circ \times 90^\circ$ from early SDSS data. The analysis of systematic effects and statistical uncertainties is presented in [572], where the angular correlation function is calculated and the impact of several potential systematic errors is evaluated, from star/galaxy separation to the effects of seeing variations and CCD systematics, finding all of them to be under control. The Limber scaling test is performed and shown to make angular correlation functions corresponding to all four magnitude bins agree when scaled to the same depth$^{147}$. Analysis of statistical errors includes calculation of covariance matrices for $w_2(\theta)$ in the four slices using 200 realizations of mock catalogs constructed using the PTHalos code [571] and also using the subsampling and jackknife methods.

Analysis of the angular correlation function is presented in [157], which is found to be consistent with results from previous surveys (see also [261]). On scales between 1 degree and 1 arcminute, the correlation functions are well described by a power-law with exponent of about $-0.7$, in agreement with Eq. (625). The amplitude of the correlation function within this angular interval decreases with fainter magnitudes in accord with previous galaxy surveys. There is a characteristic break in the correlation functions on scales close to 1-2 degrees, showing a somewhat smaller amplitude at large scales (for the corresponding magnitude slice) than the APM correlation function. On small scales, less than an arcminute, the SDSS correlation function does not appear to be consistent with the same power-law fitted to the larger angular scales. This result should however be regarded as preliminary due to the still limited amount of data (only 1.6% of the final size of the SDSS photometric sample) and the uncertainties in modeling the covariance matrix of $w_2(\theta)$ obtained from the mock catalogs described above.

The angular power spectrum $P_{2D}(\ell)$ is obtained in [651] for large angular scales corresponding to multipole moments $\ell < 600$. The data in all four magnitude bins are shown to be consistent with a simple ΛCDM “concordance” model with non-linear evolution (particularly evident for the brightest galaxies) and linear bias factors of order unity. The results were obtained using KL compression and quadratic estimators, and are presented in terms of uncorrelated band powers (Sect. 6.11). These results, together with those of the angular correlation function [572,157], are used in [189] to perform an inversion to obtain the 3D power spectrum, using a variant of the SVD decomposition method of [204]$^{148}$ with the corresponding covariance matrix computed from the mock catalogs. The resulting 3D power spectrum estimates from both inversions agree with each other and with previous estimates from the APM survey for $0.03\,h$/Mpc $< k < 1\,h$/Mpc. These results are shown to agree with an alternative method presented in [617], where the projected galaxy distribution is expanded in KL eigenmodes and the 3D power spectrum parameters recovered are $\Gamma = 0.188 \pm 0.04$ and $b_1\sigma_8 = 0.915 \pm 0.06$.

147 These correspond to $r^* = 18-19$, $19-20$, $20-21$, $21-22$, with median redshifts $z \approx 0.17, 0.25, 0.35, 0.46$ [189].

148 See Sect. 8.2.3 for a brief discussion of inversion procedures and results.

Preliminary results for the higher order correlations in the SDSS have been presented in [261,262,637], including $s_3$, $s_4$, $q_3$ and $c_{12}$ statistics. In all cases a very good agreement with previous surveys was found. In particular, at the bright end the agreement with the APM results is quite remarkable despite the important differences in survey design and calibration. These results confirm the need for non-trivial biasing at small scales, as discussed in Sects. 8.2.4-8.2.5 (see also Fig. 54).

9 Summary & Conclusions


As illustrated throughout this work, PT provides a valuable tool to understand and calculate predictions for the evolution of large scale structure in the universe. The last decade has witnessed substantial activity in this area, with strong interplay with numerical simulations of structure formation and observations of clustering of galaxies and, more recently, weak gravitational lensing. As galaxy surveys become larger, probing more volume in the weakly non-linear regime, new applications of PT are likely to flourish and provide new ways of learning about cosmology, the origin of primordial fluctuations, and the relation between galaxies and dark matter.

The general framework of these calculations is well established and calcula¬ 
tions have been pursued for a number of observational situations, whether 
it is for the statistical properties of the local density contrast, the velocity 
divergence, for the projected density contrast, redshift measurements or for 
more elaborate statistics such as joint density cumulants. All these results 
provide robust frameworks for understanding the observations or for reliable 
error computations. There are however a number of outstanding issues that 
remain to be addressed in order to improve our understanding of gravitational 
instability at large scales, 

- Most of the calculations have been done assuming Gaussian initial conditions, except for some specific cases such as $\chi^2$ models. Although present observations are consistent with Gaussian initial conditions, deriving quantitative constraints on primordial non-Gaussianity requires some knowledge or useful parametrization of non-Gaussian initial conditions and how they evolve by gravity.

- Predictions of PT for velocity field statistics are still in a rudimentary state compared to the case of the density field. Upcoming velocity surveys will start probing scales where PT predictions can be used. In addition, robust methods for calculating redshift distortions including the non-linear effects due to the redshift-space mapping are needed to fully extract information from the next generation of galaxy redshift surveys.

- Another observational context in which a PT approach can be very valuable is the Lyman-α forest observed in quasar spectra. The statistical properties of these systems should be accessible to perturbative methods since most of the absorption lines correspond to modest density contrasts (from 1 to 10). This is a very promising field for observational cosmology.

- Accurate constraints on cosmological parameters from galaxy surveys re¬ 
quire precise models of the joint likelihood of low and higher-order statistics 
including their covariance matrices. To date this has only been investigated 
in detail numerically, or analytically in some restricted cases. 

In addition, as we probe the transition to the non-linear regime, there are a few technical issues that need more investigation,


- Most results have been obtained in the tree-level approximation, for which systematic calculations can be done and the emergence of non-Gaussianity can be characterized in an elegant way. There is no such systematic framework for loop corrections, and only a few general results are known in this case. Furthermore, loop corrections are found to be divergent for power-law spectra with index $n > -1$, the interpretation of which is still not clear. Although this issue is irrelevant for realistic spectra such as CDM, its resolution may shed some light on the physics of the non-linear regime.

- The SC collapse prescription (Sect. 5.5.2) leads to a good description of $S_p$ parameters in the transition to the non-linear regime when compared to N-body simulations and exact one-loop corrections when known. Is it possible to improve on this approximation, or make it more rigorous in any well-controlled way while maintaining its simplicity?

- The development of HEPT (Sect. 4.5.6) and EPT (Sect. 5.13) suggests that 
there is a deep connection between gravitational clustering at large and 
small scales. Is this really so, or is it just an accident? Why do strongly 
non-linear clustering amplitudes seem to be so directly related to initial 
conditions? 

From the observational point of view, the next few years promise to be extremely exciting, with the completion of 2dFGRS and SDSS and deep surveys that will trace the evolution of large-scale structure towards high redshift$^{149}$. Observations of the so-called Lyman break galaxies [603] should soon provide a precious probe of the high-redshift universe, in particular regarding the evolution of galaxy bias [2,528,110]. Furthermore, weak lensing observations will provide measurements of the projected mass density that can be directly compared with theoretical predictions. In addition, CMB satellites and high-resolution experiments will probe scales that overlap with galaxy surveys and thus provide a consistency check on the framework of the growth of structure.

Outstanding observational issues abound, most of them perhaps related to the way galaxies form and evolve. One of the most pressing ones, as discussed many times in Chapter 8, is probably to have a convincing explanation of why correlation functions scale as power laws at non-linear scales. The scaling in Figs. 50 and 56 is certainly remarkable, and preliminary results from 2dFGRS [480] and SDSS [704,157] already seem to confirm and extend these results. In the CDM framework, however, this simple behavior is thought to be the result of an accidental cancellation of the dark matter non-power-law form by scale-dependent bias due to the way dark matter halos are populated by galaxies (see discussion in Sect. 7.1.3). Although this may seem rather ad hoc, this model has, on the other hand, many observable consequences. The same weighting that makes the two-point function depend as a power law on separation [579,495,570] suppresses the velocity dispersion and mean streaming of galaxies [591,592] as observed, see e.g. [349]. In addition, this weighting affects higher-order statistics in the non-linear regime, suppressing them in comparison with their dark matter counterparts [570] (see Fig. 45) as observed, see e.g. Fig. 54 for a comparison between dark matter and $S_p$ in the APM survey. There are also complementary indications that galaxies do not trace the underlying dark matter distribution at small scales from measurements of higher-order statistics. As discussed in Sect. 8.2.3, reconstruction of the linear power spectrum from galaxy surveys leads to significant disagreement of higher-order moments if no biasing is imposed at small scales, as shown in Fig. 54 for APM galaxies. A promising way to confirm that the underlying higher-order statistics of the dark matter are much higher than those of galaxies at small scales is by measuring higher-order moments in weak gravitational lensing. This will likely be done in the near future, as weak lensing surveys are already beginning to probe the relation between dark matter and galaxies [315].

149 See e.g. [130] for a recent assessment of how well upcoming deep surveys will determine correlation functions.

In any case, statistical analyses of future observations are going to decide whether the small-scale behavior of correlations is dictated by biasing or if a new framework is needed to understand galaxy clustering at non-linear scales. What seems clear, whatever the outcome, is that the techniques described here will be a valuable tool to achieve that goal.

A The Spherical Collapse Dynamics


The spherical collapse dynamics can be obtained from the Friedmann equations for the expansion factor in different cosmologies. It amounts to solving the equation of motion for the radius $R$ of a shell collapsing under its own gravity,

$$\frac{d^2 R}{dt^2} = -\frac{G\,M(<R)}{R^2}, \qquad \mathrm{(A.1)}$$

where $M(<R)$ is the mass enclosed within a radius $R$. The corresponding density contrast can be defined as,

$$\delta_{sc}(t) = \frac{M(<R)}{4\pi\bar\rho R^3/3} - 1. \qquad \mathrm{(A.2)}$$

Explicit solutions are known for open or closed universes without a cosmological constant. Their complete derivation can be found in [508], where the density contrast is expressed as a function of time $t$. We present the results here in a slightly different way by expressing the nonlinear density contrast $\delta_{sc}$ as a function of the linear density contrast, $\epsilon$ ($= D_+(t)\,\delta_{\rm init}$) [43].

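For an open universe the background evolution is described by the parameter $\psi_0$, so that the current value of the density parameter is given by,

$$\Omega_0 = \frac{2}{1 + \cosh\psi_0}. \qquad \mathrm{(A.3)}$$

Similarly the density fluctuation is characterized by a parameter $\theta$. There is a minimal value of the linear density contrast below which the density fluctuation is still below critical and does not collapse. This is given by,

$$\epsilon_{\min} = \frac{9}{2}\,\frac{\sinh\psi_0\,(\sinh\psi_0 - \psi_0)}{(\cosh\psi_0 - 1)^2} - 3. \qquad \mathrm{(A.4)}$$

As a result, if the linear density contrast $\epsilon > \epsilon_{\min}$, the evolution of the perturbation density is given by,

$$\delta_{sc}(\epsilon) = \left(\frac{\cosh\psi_0 - 1}{-\cos\theta + 1}\right)^3 \left(\frac{-\sin\theta + \theta}{\sinh\psi_0 - \psi_0}\right)^2 - 1, \qquad \mathrm{(A.5)}$$

with

$$\epsilon = \epsilon_{\min}\left[\left(\frac{-\sin\theta + \theta}{\sinh\psi_0 - \psi_0}\right)^{2/3} + 1\right]. \qquad \mathrm{(A.6)}$$

If $\epsilon < \epsilon_{\min}$ we have,

$$\delta_{sc}(\epsilon) = \left(\frac{\cosh\psi_0 - 1}{\cosh\theta - 1}\right)^3 \left(\frac{\sinh\theta - \theta}{\sinh\psi_0 - \psi_0}\right)^2 - 1, \qquad \mathrm{(A.7)}$$

with

$$\epsilon = -\epsilon_{\min}\left[\left(\frac{\sinh\theta - \theta}{\sinh\psi_0 - \psi_0}\right)^{2/3} - 1\right]. \qquad \mathrm{(A.8)}$$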

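The Einstein-de Sitter case is recovered when $\psi_0 \to 0$. It implies that $\epsilon_{\min} \to 0$.

In this case the solution reads, for $\epsilon < 0$,

$$\delta_{sc}(\epsilon) = \frac{9}{2}\,\frac{(\sinh\theta - \theta)^2}{(\cosh\theta - 1)^3} - 1, \qquad \mathrm{(A.9)}$$

$$\epsilon = -\frac{3}{5}\left[\frac{3}{4}(\sinh\theta - \theta)\right]^{2/3}, \qquad \mathrm{(A.10)}$$

and for $\epsilon > 0$,

$$\delta_{sc}(\epsilon) = \frac{9}{2}\,\frac{(\theta - \sin\theta)^2}{(1 - \cos\theta)^3} - 1, \qquad \mathrm{(A.11)}$$

$$\epsilon = \frac{3}{5}\left[\frac{3}{4}(\theta - \sin\theta)\right]^{2/3}. \qquad \mathrm{(A.12)}$$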
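The Einstein-de Sitter parametric solution for $\epsilon > 0$ is easy to evaluate numerically. The sketch below assumes the standard forms $\epsilon = \frac{3}{5}[\frac{3}{4}(\theta - \sin\theta)]^{2/3}$ and $\delta_{sc} = \frac{9}{2}(\theta - \sin\theta)^2/(1 - \cos\theta)^3 - 1$; the function name is illustrative. It checks that the nonlinear and linear contrasts agree at early times and recovers the familiar turnaround and collapse values.

```python
import math

def eds_collapse(theta):
    """Return (epsilon, delta) for an overdense shell in an Einstein-de Sitter
    background, using the parametric solution for epsilon > 0."""
    eps = 0.6 * (0.75 * (theta - math.sin(theta))) ** (2.0 / 3.0)
    delta = 4.5 * (theta - math.sin(theta)) ** 2 / (1.0 - math.cos(theta)) ** 3 - 1.0
    return eps, delta

# Early times (small theta): the nonlinear and linear contrasts should agree.
eps_small, delta_small = eds_collapse(0.1)

# Turnaround (theta = pi), and the linear contrast extrapolated to collapse (theta = 2 pi).
eps_turn, delta_turn = eds_collapse(math.pi)
eps_collapse = 0.6 * (0.75 * 2.0 * math.pi) ** (2.0 / 3.0)

print(eps_small, delta_small)
print(eps_turn, delta_turn)  # about 1.06 and 4.55
print(eps_collapse)          # about 1.686
```

The nonlinear contrast itself diverges at $\theta = 2\pi$, which is why only the linear value is evaluated there.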


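In the limit $\Omega_0 \to 0$ we have $\psi_0 \to \infty$. It implies that $\epsilon_{\min} \to 3/2$. Moreover $\epsilon$ is finite when $\theta$ is close to $\psi_0$, so that,

$$\epsilon = \frac{3}{2}\left[1 - \left(\frac{\exp\theta}{\exp\psi_0}\right)^{2/3}\right], \qquad \delta_{sc} = \frac{\exp\psi_0}{\exp\theta} - 1, \qquad \mathrm{(A.13)}$$

which gives,

$$\delta_{sc}(\epsilon) = \frac{1}{(1 - 2\epsilon/3)^{3/2}} - 1. \qquad \mathrm{(A.14)}$$

The case of a closed universe is obtained by the change of variable $\psi_0 \to i\psi_0$.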
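The two limits quoted in this appendix can be checked numerically. The sketch below is a consistency check under stated assumptions: it takes $\epsilon_{\min} = \frac{9}{2}\sinh\psi_0(\sinh\psi_0 - \psi_0)/(\cosh\psi_0 - 1)^2 - 3$ and, on the $\epsilon < \epsilon_{\min}$ branch, $\epsilon = \epsilon_{\min}[1 - ((\sinh\theta - \theta)/(\sinh\psi_0 - \psi_0))^{2/3}]$, and verifies that these forms reproduce $\epsilon_{\min} \to 0$ (Einstein-de Sitter), $\epsilon_{\min} \to 3/2$ ($\Omega_0 \to 0$) and the closed form (A.14). Function names are illustrative.

```python
import math

def eps_min(psi0):
    """Minimal linear contrast for collapse in an open background (assumed form)."""
    s, c = math.sinh(psi0), math.cosh(psi0)
    return 4.5 * s * (s - psi0) / (c - 1.0) ** 2 - 3.0

def open_underdense(theta, psi0):
    """Return (epsilon, delta) on the epsilon < eps_min branch (assumed form)."""
    ratio = (math.sinh(theta) - theta) / (math.sinh(psi0) - psi0)
    eps = eps_min(psi0) * (1.0 - ratio ** (2.0 / 3.0))
    delta = ((math.cosh(psi0) - 1.0) / (math.cosh(theta) - 1.0)) ** 3 * ratio ** 2 - 1.0
    return eps, delta

def delta_omega_zero(eps):
    """Closed form of the nonlinear contrast in the Omega_0 -> 0 limit."""
    return (1.0 - 2.0 * eps / 3.0) ** -1.5 - 1.0

print(eps_min(0.05))  # small: tends to 0 in the Einstein-de Sitter limit
print(eps_min(30.0))  # tends to 3/2 as Omega_0 -> 0
eps, delta = open_underdense(theta=29.0, psi0=30.0)
print(delta, delta_omega_zero(eps))  # the two values should agree
```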


B Tree Summations 

In this appendix we present methods for performing tree summations. These calculations were initially developed in different contexts (such as polymer physics, see e.g. [185]). In cosmology, these computation techniques were introduced in [551] and presented in detail for a more complex situation in [41].

B.1 For One Field

The issue we address is the computation of the sum of all tree diagrams (in a specific sense given in the following) connecting an arbitrary number of points. More specifically we define $\varphi(y)$ as (minus) the sum of all diagrams with the weight $(-y)^n$ for diagrams of $n$ points.

For computing the contribution of each order the rule is to build all possible minimal connections (that means $n - 1$ connections for $n$ points) and to assign the value $\nu_p$ to points connected to $p$ neighbors. The value of each diagram is then given by the product of the vertices $\nu_p$ it is composed of.

The function $\varphi(y)$ then corresponds to the cumulant generating function,

$$\varphi(y, \nu_1, \nu_2, \ldots) = -\sum_{n=2}^{\infty} (-y)^n \sum_{\rm trees\ connecting\ n\ points} \left(\prod_{\rm vertices} \nu_p\right). \qquad \mathrm{(B.1)}$$

At the end of the calculation the value of $\nu_1$ will be unity, but for the time being we assume it is a free parameter. Then $\varphi$ is a function of $y$ and of the vertices $\nu_p$. We can then define $\tau$ as

$$-\tau = \frac{1}{-y}\,\frac{\partial(-\varphi)}{\partial\nu_1}. \qquad \mathrm{(B.2)}$$

Like $\varphi$, $(-\tau)$ is a function of $y$ and of the vertices $\nu_p$. This corresponds to all the diagrams for which one external line (connected to a $\nu_1$ vertex) has been marked away. This is the sum of so-called diagrams with one free external line. It is possible to write down an implicit equation for $\tau$,

$$-\tau = -y\left(\nu_1 - \nu_2\,\tau + \nu_3\,\frac{\tau^2}{2!} + \ldots + \nu_p\,\frac{(-\tau)^{p-1}}{(p-1)!} + \ldots\right). \qquad \mathrm{(B.3)}$$

This equation expresses the fact that $\tau$ can be reconstructed in a recursive way (see Fig. 25). Note the factor $(p-1)!$ which corresponds to the symmetry factor. If one defines the generating function of the vertices,

$$\zeta(\tau) = \sum_{p=1}^{\infty} \nu_p\,\frac{(-\tau)^p}{p!}, \qquad \mathrm{(B.4)}$$

then we have,

$$\tau = -y\,\frac{d\zeta}{d\tau}. \qquad \mathrm{(B.5)}$$

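To complete the calculation we need to introduce the Legendre transform $\mathcal{L}(y, \tau, \nu_2, \ldots)$ defined as

$$\mathcal{L}(y, \tau, \nu_2, \ldots) = \varphi + y\,\nu_1\,\tau. \qquad \mathrm{(B.6)}$$

It is important to note that $\mathcal{L}$ is viewed as a function of $\tau$ and not of $\nu_1$. We then have the remarkable property due to the Legendre transform,

$$\frac{\partial\mathcal{L}}{\partial\tau} = \frac{\partial\varphi}{\partial\nu_1}\frac{\partial\nu_1}{\partial\tau} + y\,\tau\,\frac{\partial\nu_1}{\partial\tau} + y\,\nu_1 = y\,\nu_1. \qquad \mathrm{(B.7)}$$

From Eq. (B.3) we have,

$$y\,\nu_1 = \tau - y\sum_{p=2}^{\infty} \nu_p\,\frac{(-\tau)^{p-1}}{(p-1)!}, \qquad \mathrm{(B.8)}$$

which, after integrating relation (B.7), implies that,

$$\mathcal{L} = c + \frac{\tau^2}{2} + y\sum_{p=2}^{\infty} \nu_p\,\frac{(-\tau)^p}{p!} = c + \frac{\tau^2}{2} + y\,\zeta(\tau) + y\,\nu_1\,\tau, \qquad \mathrm{(B.9)}$$

which leads to (the integration constant $c = 0$ is such that $\varphi(y) \sim -y^2$ at leading order in $y$),

$$\varphi(y) = y\,\zeta(\tau) - \frac{1}{2}\,y\,\tau\,\zeta'(\tau). \qquad \mathrm{(B.10)}$$

This equation, with Eq. (B.3), gives the tree generating function expressed as a function of the vertex generating function $\zeta$.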
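The implicit relation $\tau = -y\,\zeta'(\tau)$ together with $\varphi(y) = y\,\zeta(\tau) - \frac{1}{2}\,y\,\tau\,\zeta'(\tau)$ can be evaluated numerically for any vertex generating function. The sketch below is a minimal illustration under two assumptions that are not taken from the text: all vertices are set to $\nu_p = 1$, so that $\zeta(\tau) = e^{-\tau} - 1$, and the sum over trees is read as a sum over labeled trees weighted by $1/n!$. With that reading, the result must reproduce Cayley's count $n^{n-2}$ of labeled trees on $n$ points, which gives an independent check. Function names are illustrative.

```python
import math

def solve_tau(y, zeta_prime, tol=1e-14, max_iter=200):
    """Solve the implicit relation tau = -y * zeta'(tau) by fixed-point iteration."""
    tau = 0.0
    for _ in range(max_iter):
        new_tau = -y * zeta_prime(tau)
        if abs(new_tau - tau) < tol:
            return new_tau
        tau = new_tau
    raise RuntimeError("fixed-point iteration did not converge; try a smaller y")

def phi(y, zeta, zeta_prime):
    """Tree generating function phi(y) = y*zeta(tau) - (1/2)*y*tau*zeta'(tau)."""
    tau = solve_tau(y, zeta_prime)
    return y * zeta(tau) - 0.5 * y * tau * zeta_prime(tau)

# Illustrative vertex choice (an assumption): nu_p = 1 for every p >= 1,
# so zeta(tau) = sum_{p>=1} (-tau)^p / p! = exp(-tau) - 1.
zeta = lambda tau: math.exp(-tau) - 1.0
zeta_prime = lambda tau: -math.exp(-tau)

y = 0.01
value = phi(y, zeta, zeta_prime)
# Labeled-tree expansion: -sum_{n>=2} (-y)^n n^(n-2) / n!
series = -sum((-y) ** n * n ** (n - 2) / math.factorial(n) for n in range(2, 8))
print(value, series)
```

Fixed-point iteration is adequate here because the map is a contraction for small $y$; for larger $y$ a Newton solver would be the more robust choice.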

B.2 For Two Fields 


We can extend the previous results to joint tree summations. This corresponds either to two different fields taken at the same position (such as the density and the velocity divergence), or to two fields taken at different locations. We want to construct the joint generating function, $\varphi(y_1, y_2)$, of the joint cumulants,

TyVi > 2 / 2 ) / , C nm . , (B.ll) 

n,m,n+m>2 Tl] 


where $C_{nm}$ is the value of each cumulant. In this case for each diagram there are $n$ vertices of type 1, and $m$ of type 2. They take respectively the values $\nu_p$ and $\mu_q$ if they are connected respectively to $p$ or $q$ neighbors. Obviously if the two fields are identical the two series coincide. Moreover, in order to account for cell separation, a weight $\xi$ is assigned to each line connecting points of different nature.

The generating function $\varphi$ is then a function of $y_1, y_2, \xi, \nu_1, \ldots, \mu_1, \ldots$. One can define the two functions $\tau_1$ and $\tau_2$ by,

$$-\tau_1 = \frac{1}{-y_1}\,\frac{\partial(-\varphi)}{\partial\nu_1}, \qquad -\tau_2 = \frac{1}{-y_2}\,\frac{\partial(-\varphi)}{\partial\mu_1}. \qquad \mathrm{(B.12)}$$

It is easy to see that the functions $\tau_1$ and $\tau_2$ are given respectively by,

$$\tau_1 = y_1\sum_p \nu_p\,\frac{(-\tau_1)^{p-1}}{(p-1)!} + \xi\,y_2\sum_p \mu_p\,\frac{(-\tau_2)^{p-1}}{(p-1)!}, \qquad \mathrm{(B.13)}$$

$$\tau_2 = \xi\,y_1\sum_p \nu_p\,\frac{(-\tau_1)^{p-1}}{(p-1)!} + y_2\sum_p \mu_p\,\frac{(-\tau_2)^{p-1}}{(p-1)!}. \qquad \mathrm{(B.14)}$$

This expresses the fact that there is a joint recursion between the two functions. A factor $\xi$ is introduced whenever a vertex of a given type is connected to a vertex of the other type.

Defining the Legendre transform as $\mathcal{L} = \varphi + y_1\tau_1\nu_1 + y_2\tau_2\mu_1$, one obtains,

$$\frac{\partial\mathcal{L}}{\partial\tau_1} = y_1\,\nu_1, \qquad \frac{\partial\mathcal{L}}{\partial\tau_2} = y_2\,\mu_1. \qquad \mathrm{(B.15)}$$

One should then solve the linear system for $\nu_1$ and $\mu_1$ given by Eqs. (B.13, B.14). One eventually gets for $\varphi$,

$$\varphi = y_1\,\zeta_1(\tau_1) + y_2\,\zeta_2(\tau_2) + \frac{1}{2(1 - \xi^2)}\left(\tau_1^2 - 2\,\xi\,\tau_1\tau_2 + \tau_2^2\right), \qquad \mathrm{(B.16)}$$

where $\zeta_1$ and $\zeta_2$ are respectively the generating functions of $\nu_p$ and $\mu_p$. This result can be rewritten in a more elegant form,

$$\varphi(y_1, y_2) = y_1\,\zeta_1(\tau_1) + y_2\,\zeta_2(\tau_2) - \frac{1}{2}\,y_1\tau_1\,\zeta_1'(\tau_1) - \frac{1}{2}\,y_2\tau_2\,\zeta_2'(\tau_2). \qquad \mathrm{(B.17)}$$

If $\xi$ is unity, for instance for the computation of the joint density distribution of $(\delta, \theta)$, we have,

$$\tau = \tau_1 = \tau_2 = -y_1\,\zeta_1'(\tau) - y_2\,\zeta_2'(\tau). \qquad \mathrm{(B.18)}$$

B.3 The Large Separation Limit 

The other case of interest is when $\xi$ is small (which means that the correlation function at the cell separation is much smaller than the average correlation function at the cell size).

It is then possible to expand $\varphi(y_1, y_2)$ at leading order in $\xi$. The result reads,

$$\varphi(y_1, y_2) = \varphi_1(y_1) + \varphi_2(y_2) - \tau_1^{(0)}(y_1)\,\tau_2^{(0)}(y_2)\,\xi, \qquad \mathrm{(B.19)}$$

where $\tau_1^{(0)}$ and $\tau_2^{(0)}$ are respectively the functions $\tau_1$ and $\tau_2$ computed when $\xi = 0$.

C Geometrical Properties of Top-Hat Window Functions 


In this section we recall the properties of top-hat window functions. The derivations are presented in a systematic way for any dimension of space $D$. The window function $W_D$ in Fourier space is given by

$$W_D(k) = 2^{D/2}\,\Gamma\!\left(\frac{D}{2} + 1\right)\frac{J_{D/2}(k)}{k^{D/2}}. \qquad \mathrm{(C.1)}$$

We are interested in computing the angle integrals of $W_D(|\mathbf{l}_1 - \mathbf{l}_2|)$ times a geometrical function that can be expressed in terms of Legendre polynomials. In particular we want to compute $\int d^D\Omega\; W_D(|\mathbf{l}_1 - \mathbf{l}_2|)\,[1 - (\mathbf{l}_1\cdot\mathbf{l}_2)^2/(l_1^2\,l_2^2)]$ and $\int d^D\Omega\; W_D(|\mathbf{l}_1 - \mathbf{l}_2|)\,[1 + \mathbf{l}_1\cdot\mathbf{l}_2/l_1^2]$. In general the only angle that intervenes in the angular integral, $d^D\Omega$, is the relative angle $\varphi$, so that $d^D\Omega/\Omega_{\rm tot.}$ reduces to $\Gamma(D/2)/(\sqrt{\pi}\,\Gamma[(D-1)/2])\,\sin(\varphi)^{D-2}\,d\varphi$, $0 < \varphi < \pi$.

In order to complete these calculations, we need the summation theorem (GR, 8.532) for Bessel functions,

$$\frac{J_\nu(|\mathbf{l}_1 - \mathbf{l}_2|)}{|\mathbf{l}_1 - \mathbf{l}_2|^\nu} = 2^\nu\,\Gamma(\nu)\sum_{k=0}^{\infty} (\nu + k)\,\frac{J_{\nu+k}(l_1)}{l_1^\nu}\,\frac{J_{\nu+k}(l_2)}{l_2^\nu}\,C_k^\nu(\cos\varphi), \qquad \mathrm{(C.2)}$$

where $C_k^\nu$ are Gegenbauer polynomials. Note that in the case $\nu = 0$ the previous equation reads,

$$J_0(|\mathbf{l}_1 - \mathbf{l}_2|) = J_0(l_1)\,J_0(l_2) + 2\sum_{k=1}^{\infty} J_k(l_1)\,J_k(l_2)\cos(k\varphi). \qquad \mathrm{(C.3)}$$

In the following, the only property of interest for the Gegenbauer polynomials is (GR, 7.323)

The only non-vanishing term of this summation is the one corresponding to $k = 0$. We finally have,

This result can be read as a kind of commutation rule: the filtering can be applied to the wave vectors separately provided the angular kernel is properly averaged. The second relation can be obtained from the observations that

$$\frac{d}{dl}\left[\frac{J_{D/2-1}(l)}{l^{D/2-1}}\right] = -l\,\frac{J_{D/2}(l)}{l^{D/2}}, \qquad \mathrm{(C.8)}$$

$$\frac{J_{D/2-1}(l)}{l^{D/2-1}} = D\,\frac{J_{D/2}(l)}{l^{D/2}} + l\,\frac{d}{dl}\left[\frac{J_{D/2}(l)}{l^{D/2}}\right]. \qquad \mathrm{(C.9)}$$


The summation theorem applied to $J_{D/2-1}(|\mathbf{l}_1 + \mathbf{l}_2|)/|\mathbf{l}_1 + \mathbf{l}_2|^{D/2-1}$ leads to,

Taking the derivative of this equality with respect to $l_2$ leads to,


d D ff 


^tot. 

Wd(/i 


W^dZi-Zal) 


h-h 

T" 


fC D (/ 2 ) + -^--^W D (/ 2 ) 
D One-Loop Calculations: Dimensional Regularization


To obtain the behavior of the one-loop $p$-point spectra for $n < -1$, one can use dimensional regularization (see e.g. [143]) to simplify the calculations considerably. Since we are interested in the limit where the ultraviolet cutoff $k_c \to \infty$, all the integrals run from 0 to $\infty$, and divergences are regulated by changing the dimensionality $d$ of space: we set $d = 3 + \epsilon$ and expand in $\epsilon \ll 1$. For example, for one-loop bispectrum calculations, we need the following one-loop three-point integral:

$$J(\nu_1, \nu_2, \nu_3) \equiv \int \frac{d^d q}{(q^2)^{\nu_1}\,[(\mathbf{k}_1 - \mathbf{q})^2]^{\nu_2}\,[(\mathbf{k}_2 - \mathbf{q})^2]^{\nu_3}}. \qquad \mathrm{(D.1)}$$

When one of the indices vanishes, e.g. $\nu_3 = 0$, this reduces to the standard formula for dimensional-regularized two-point integrals [595]

$$J(\nu_1, \nu_2, 0) = \frac{\Gamma(d/2 - \nu_1)\,\Gamma(d/2 - \nu_2)\,\Gamma(\nu_1 + \nu_2 - d/2)}{\Gamma(\nu_1)\,\Gamma(\nu_2)\,\Gamma(d - \nu_1 - \nu_2)}\;\pi^{d/2}\,k_1^{d - 2\nu_1 - 2\nu_2}, \qquad \mathrm{(D.2)}$$

which is useful for one-loop power spectrum calculations. The integral $J(\nu_1, \nu_2, \nu_3)$ appears in triangle diagrams for massless particles in quantum field theory, and can be evaluated for arbitrary values of its parameters in terms of hypergeometric functions of two variables. The result is [178]


7T^/2 Jx^ 2^123 / 
where $\nu_{123} \equiv \nu_1 + \nu_2 + \nu_3$, $\nu_{ij} \equiv \nu_i + \nu_j$, $x \equiv (\mathbf{k}_2 - \mathbf{k}_1)^2/k_1^2$, $y \equiv k_2^2/k_1^2$, and $F_4$ is Appell's hypergeometric function of two variables, with the series expansion:


F 4 (o, b; c, d; x, y) = 
where $(a)_i \equiv \Gamma(a + i)/\Gamma(a)$ denotes the Pochhammer symbol. When the spectral index is $n = -2$, the hypergeometric functions reduce to polynomials in their variables due to the following useful property for $-a$ a positive integer:

When using expressions such as Eq. (D.3), divergences appear as poles in the gamma functions; these can be handled by the following expansion ($n = 0, 1, 2, \ldots$ and $\epsilon \to 0$):


r (—n + e) 
plus terms of order $\epsilon^2$ and higher. Here $\psi(x) \equiv d\ln\Gamma(x)/dx$ and

$$\psi(n + 1) = 1 + \frac{1}{2} + \ldots + \frac{1}{n} - \gamma_e, \qquad \mathrm{(D.7)}$$

with $\psi(1) = -\gamma_e = -0.577216\ldots$ and $\psi'(1) = \pi^2/6$.
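The harmonic-sum form of $\psi(n+1)$ is easy to verify against the definition $\psi(x) = d\ln\Gamma(x)/dx$. The sketch below is only a numerical check: it differentiates Python's `math.lgamma` with a central finite difference (the step size `h` is an arbitrary choice) and compares with the harmonic sum; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant gamma_e

def digamma_numeric(x, h=1e-5):
    """psi(x) = d ln Gamma(x) / dx via a central finite difference of math.lgamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def digamma_harmonic(n):
    """psi(n + 1) = 1 + 1/2 + ... + 1/n - gamma_e for integer n >= 0."""
    return sum(1.0 / k for k in range(1, n + 1)) - EULER_GAMMA

for n in range(4):
    print(n, digamma_harmonic(n), digamma_numeric(n + 1.0))
```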
E PDF Construction from Cumulant Generating Function

In this section we present the mathematical relation between the cumulant generating function defined in Section 3 and the one-point probability distribution function of the local density, and more generally the counts-in-cells probabilities.

In this presentation we follow the calculations (and most of the notation) developed in [16].

E.1 Counts-in-Cells and Generating Functions

Let us consider a cell of volume $V$ placed at random in the field. We denote by $P(N)$ the probability that this cell contains $N$ particles. One can define the probability generating function $\mathcal{P}(\lambda)$ with,

$$\mathcal{P}(\lambda) = \sum_{N=0}^{\infty} \lambda^N P(N). \qquad \mathrm{(E.1)}$$

By construction the counts-in-cells probabilities $P(N)$ are obtained by a Taylor expansion of $\mathcal{P}(\lambda)$ around $\lambda = 0$,

$$P(N) = \frac{1}{N!}\,\frac{d^N\mathcal{P}}{d\lambda^N}(\lambda = 0). \qquad \mathrm{(E.2)}$$

Remarkably the (factorial) moments of this distribution are obtained by a Taylor expansion of $\mathcal{P}(\lambda)$ around $\lambda = 1$,

$$\mathcal{P}(1) = 1,$$

$$\frac{d^2}{d\lambda^2}\mathcal{P}(1) = \langle N(N-1)\rangle,$$

$$\frac{d^p}{d\lambda^p}\mathcal{P}(1) = \langle N(N-1)\ldots(N-p+1)\rangle. \qquad \mathrm{(E.3)}$$

If the discrete field is a Poisson realization of an underlying continuous field, then the factorial moments, $\langle N(N-1)\ldots(N-p+1)\rangle$, are equal to $\bar N^p M_p$ where $M_p$ is the $p$th moment of the local density distribution. $\mathcal{P}(\lambda)$ can therefore be written in terms of the moment generating function (see Sect. 3.3.3), $\mathcal{P}(\lambda) = \mathcal{M}[(\lambda - 1)\bar N]$, which in turn can be written in terms of the cumulant generating function, $\mathcal{C}$,

$$\mathcal{P}(\lambda) = \exp\left(\mathcal{C}[(\lambda - 1)\bar N]\right). \qquad \mathrm{(E.4)}$$
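The link between factorial moments and the underlying density moments can be illustrated in the simplest case. The sketch below assumes a uniform underlying density, so that the counts are purely Poisson and every density moment is $M_p = 1$; the factorial moments should then equal $\bar N^p$. The mean count and the truncation `n_max` are arbitrary illustrative choices.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n counts, computed in log space for stability."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def factorial_moment(pmf, p, n_max):
    """<N (N-1) ... (N-p+1)>, i.e. the p-th derivative of the generating function at lambda = 1."""
    total = 0.0
    for n in range(p, n_max + 1):
        falling = math.prod(range(n - p + 1, n + 1))  # n (n-1) ... (n-p+1)
        total += falling * pmf(n)
    return total

mean_count = 3.5  # illustrative mean count per cell (an assumption)
pmf = lambda n: poisson_pmf(n, mean_count)

for p in (1, 2, 3):
    print(p, factorial_moment(pmf, p, n_max=200), mean_count ** p)
```

For a clustered field the same function applies unchanged to a measured $P(N)$, and the ratio to $\bar N^p$ then estimates $M_p$ free of shot noise.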
When the cumulant generating function is written in terms of the $S_p$ generating function, the counts in cells read,


P( N ) = j 2^[~\N+i ex P - 


N £(1 — A) + (p(N £(1 — A)) 


where the integral is made in the complex plane around the singularity A = 0. 
One can change the variable to use y = N£(l — A), so that, 


P(N) 


zkfN (i-JL) 

ml2n\ N() 


~(N+ 1 ) 

exp 


y + <p{y) 


(E.6) 
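The relations (E.1) to (E.3) can be checked numerically on a simple case. The following Python sketch assumes a pure Poisson distribution of mean $\bar{N}$ (an illustrative choice with no clustering, for which the generating function is $\exp[(\lambda-1)\bar{N}]$ and the $p$th factorial moment is $\bar{N}^p$). The function names are hypothetical and the sum in (E.1) is truncated at a finite `nmax`.

```python
import math

def poisson_counts(nbar, nmax):
    """P(N) for a Poisson distribution of mean nbar, for N = 0..nmax."""
    return [math.exp(-nbar) * nbar**n / math.factorial(n) for n in range(nmax + 1)]

def generating_function(probs, lam):
    """Truncated version of (E.1): sum over N of lam**N * P(N)."""
    return sum(p * lam**n for n, p in enumerate(probs))

def factorial_moment(probs, p):
    """<N(N-1)...(N-p+1)>, i.e. the p-th derivative of (E.1) at lam = 1."""
    total = 0.0
    for n, prob in enumerate(probs):
        falling = 1.0
        for k in range(p):
            falling *= (n - k)          # falling factorial N(N-1)...(N-p+1)
        total += falling * prob
    return total

nbar = 2.5
probs = poisson_counts(nbar, 80)
print(generating_function(probs, 1.0))               # normalisation, close to 1
print(generating_function(probs, 0.3), math.exp((0.3 - 1.0) * nbar))
print(factorial_moment(probs, 3), nbar**3)           # third factorial moment
```

For a clustered field the same two functions apply unchanged; only the list of probabilities $P(N)$ differs.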


E.2 The Continuous Limit 


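The contributing values of $y$ are finite, so that in the continuous limit $\lambda$ 
should be close to unity. As a result one can write, with $\rho = N/\bar{N}$, 

$$\left(\frac{1}{\lambda}\right)^{N+1} = \exp\left[-(N+1)\log\left(1 - \frac{y}{\bar{N}\bar{\xi}}\right)\right] \approx \exp\left(\frac{\rho\,y}{\bar{\xi}}\right). \tag{E.7}$$

It implies that 

$$P(\rho)\,\mathrm{d}\rho = \frac{\mathrm{d}\rho}{\bar{\xi}} \int_{-i\infty}^{+i\infty} \frac{\mathrm{d}y}{2\pi i} \exp\left[-\frac{y + \varphi(y)}{\bar{\xi}} + \frac{y\rho}{\bar{\xi}}\right]. \tag{E.8}$$

This is an inverse Laplace transform. 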

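It is important to note that the counts in cells $P(N)$ can be recovered by a 
Poisson convolution of the continuous distribution. A Poisson distribution is 
given by 

$$P_{\rm Poisson}(N, \bar{N}) = \frac{\bar{N}^N}{N!}\,e^{-\bar{N}} = \oint \frac{\mathrm{d}\lambda}{2\pi i\,\lambda^{N+1}} \exp\left(-\bar{N}(1-\lambda)\right). \tag{E.9}$$

Then 

$$\int \mathrm{d}\rho\, P(\rho)\, P_{\rm Poisson}(N, \bar{N}\rho) = \int \mathrm{d}\rho \int_{-i\infty}^{+i\infty} \frac{\mathrm{d}y}{2\pi i\,\bar{\xi}} \oint \frac{\mathrm{d}\lambda}{2\pi i\,\lambda^{N+1}} \exp\left[-\frac{y + \varphi(y)}{\bar{\xi}} + \frac{\rho y}{\bar{\xi}} - \bar{N}\rho(1-\lambda)\right]. \tag{E.10}$$

The integration over $\rho$ leads to $\delta_{\rm Dirac}\big(y - \bar{N}\bar{\xi}(1-\lambda)\big)$, which finally implies 

$$\int \mathrm{d}\rho\, P(\rho)\, P_{\rm Poisson}(N, \bar{N}\rho) = P(N). \tag{E.11}$$

This is not surprising since we assumed from the very beginning that any 
discrete field would be the Poisson realization of a continuous field. 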
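The Poisson convolution (E.11) can be illustrated numerically. The sketch below assumes, purely for illustration, that the continuous PDF is a Gamma law with mean 1 and variance $\bar{\xi}$ (this choice is not made in the text); its Poisson convolution is then known in closed form, the negative binomial distribution, which gives an independent check. The function names and integration settings are hypothetical.

```python
import math

def gamma_pdf(rho, xi):
    """Illustrative continuous PDF: Gamma law with mean 1 and variance xi (xi < 1 assumed)."""
    k = 1.0 / xi                      # shape parameter; the scale parameter is xi
    if rho <= 0.0:
        return 0.0
    return math.exp((k - 1.0) * math.log(rho) - rho / xi
                    - k * math.log(xi) - math.lgamma(k))

def poisson(n, mean):
    """Poisson probability of n counts for the given mean."""
    if mean <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def counts_from_pdf(n, nbar, xi, rho_max=60.0, steps=60000):
    """Left-hand side of (E.11), evaluated with the trapezoidal rule."""
    h = rho_max / steps
    total = 0.0
    for i in range(steps + 1):
        rho = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * gamma_pdf(rho, xi) * poisson(n, nbar * rho)
    return total * h

def negative_binomial(n, nbar, xi):
    """Closed-form Poisson convolution of the Gamma law above."""
    k = 1.0 / xi
    log_p = (math.lgamma(n + k) - math.lgamma(n + 1) - math.lgamma(k)
             - k * math.log(1.0 + nbar * xi)
             + n * math.log(nbar * xi / (1.0 + nbar * xi)))
    return math.exp(log_p)

print(counts_from_pdf(2, 3.0, 0.5), negative_binomial(2, 3.0, 0.5))
```

Any other positive, normalized $P(\rho)$ can be substituted for `gamma_pdf`; only the closed-form comparison is specific to the Gamma assumption.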
E.3 Approximate Forms for $P(\rho)$ when $\bar{\xi} \ll 1$ 


In this paragraph, we review the various approximations that have been used 
for $P(\rho)$. The appropriate one depends on the regime of interest, that is, on the 
amplitude of the density fluctuations $\bar{\xi}$. 

When $\bar{\xi}$ is small, it is possible to apply a saddle point approximation to (E.8), 
whose exponent is $[(\rho - 1)\,y - \varphi(y)]/\bar{\xi}$. The saddle point $y_s$ is defined by 

$$\rho = 1 + \frac{\mathrm{d}\varphi(y_s)}{\mathrm{d}y}. \tag{E.12}$$

It leads to 

$$P(\rho) = \frac{1}{\sqrt{-2\pi\,\bar{\xi}\,\varphi''(y_s)}} \exp\left\{-\frac{1}{\bar{\xi}}\left[\varphi(y_s) - y_s\,\varphi'(y_s)\right]\right\}. \tag{E.13}$$

In case $\varphi(y)$ is obtained through a tree summation, as for the weakly 
non-linear regime, one finally gets the formula (312). 
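The saddle point recipe can be sketched in a few lines of Python. The code below assumes that the exponent of (E.8) is $[(\rho-1)\,y - \varphi(y)]/\bar{\xi}$, so that the saddle point obeys $\rho = 1 + \varphi'(y_s)$; the function name, the Newton iteration and the Gaussian test case $\varphi(y) = -y^2/2$ are illustrative assumptions, not taken from the text. In the Gaussian case the approximation must return the Gaussian PDF of variance $\bar{\xi}$ exactly.

```python
import math

def saddle_point_pdf(rho, xi, phi, dphi, d2phi, y0=0.0, tol=1e-12, max_iter=100):
    """Saddle-point estimate of P(rho), assuming the exponent of (E.8) is
    [(rho - 1) * y - phi(y)] / xi."""
    y = y0
    for _ in range(max_iter):
        # Newton step for the saddle-point condition rho - 1 - phi'(y) = 0
        step = (rho - 1.0 - dphi(y)) / d2phi(y)
        y += step
        if abs(step) < tol:
            break
    curvature = d2phi(y)
    if curvature >= 0.0:
        raise ValueError("phi''(y_s) must be negative for the approximation to hold")
    exponent = -(phi(y) - y * dphi(y)) / xi
    return math.exp(exponent) / math.sqrt(-2.0 * math.pi * xi * curvature)

# Gaussian test case: phi(y) = -y**2 / 2 (all S_p with p > 2 set to zero)
rho, xi = 1.3, 0.04
approx = saddle_point_pdf(rho, xi,
                          phi=lambda y: -0.5 * y * y,
                          dphi=lambda y: -y,
                          d2phi=lambda y: -1.0)
exact = math.exp(-(rho - 1.0)**2 / (2.0 * xi)) / math.sqrt(2.0 * math.pi * xi)
print(approx, exact)
```

For a non-Gaussian $\varphi$ the Newton iteration may need a better starting point `y0`, and the `ValueError` branch signals the breakdown of the approximation discussed in the text.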

Obviously such a result makes sense only if $\varphi''(y_s)$ is negative. Because of the 
presence of a singular point on the real axis this will not always be the case. 
In practice it holds only for values of the density smaller than a critical 
value, $\rho_c$. These values are given in table 9 for the results obtained in the 
quasi-linear regime. 

For $\rho > \rho_c$ the saddle point position is pushed towards the singularity. 
The behavior of the PDF is then dominated by the behavior of $\varphi(y)$ 
around this point. Let us write generally $\varphi(y)$ as 

$$\varphi(y) = \varphi_s + r_s\,(y - y_s) + \cdots - a_s\,(y - y_s)^{\omega_s}, \tag{E.14}$$

where the expansion around the singular point $y_s$ has been decomposed into its 
regular part, $\varphi_s + r_s(y - y_s) + \ldots$, and its singular part, $a_s(y - y_s)^{\omega_s}$, where $\omega_s$ is a 
non-integer value ($\omega_s = 3/2$ in the quasi-linear theory). In (E.8) the integration 
path for $y$ is pushed towards the negative part of the real axis ($y < y_s$). 
It can thus be described by the real variable $u$ varying from 0 to $\infty$ with 

$$y = y_s + u\,e^{\pm i\pi}, \tag{E.15}$$

where the sign changes according to whether $y$ is above or below the real 
axis. Expanding the singular part in the exponential one gets, 


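$$P(\rho) \approx \int_0^{\infty} \mathrm{d}u\, \frac{a_s\,u^{\omega_s}}{\bar{\xi}^2}\, \frac{e^{\pm i\pi(\omega_s - 1)}}{2\pi i} \exp\left[-\frac{\varphi_s}{\bar{\xi}} + \frac{(\rho - 1)\,y_s}{\bar{\xi}} - \frac{(\rho - 1 - r_s)\,u}{\bar{\xi}}\right], \tag{E.16}$$

which gives, 

$$P(\rho) \approx \frac{a_s}{\Gamma(-\omega_s)\,\bar{\xi}^2} \left(\frac{\rho - 1 - r_s}{\bar{\xi}}\right)^{-\omega_s - 1} \exp\left[-\frac{\varphi_s}{\bar{\xi}} + \frac{(\rho - 1)\,y_s}{\bar{\xi}}\right], \tag{E.17}$$

taking advantage of the relation $\Gamma(\omega_s + 1)\,\Gamma(-\omega_s) = -\pi/\sin(\pi\omega_s)$. Here the 
exponent of (E.8) has been written as $[(\rho - 1)\,y - \varphi(y)]/\bar{\xi}$. For the 
parameters describing the quasi-linear theory one gets the relation (314). 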


E.4 Approximate Forms for $P(\rho)$ when $\bar{\xi} \gg 1$ 

Two scaling domains have been found (see [16] for a comprehensive 
presentation of the scaling laws). The first corresponds to the rather dense regions, 
that is, to cases where $\varphi(y)$ is always finite in (E.8). For large values of $\bar{\xi}$ 
it is therefore possible to write 

$$P(\rho) = -\frac{1}{\bar{\xi}^2} \int_{-i\infty}^{+i\infty} \frac{\mathrm{d}y}{2\pi i}\, \varphi(y)\, \exp(xy) \quad \text{with } x = \rho/\bar{\xi}. \tag{E.18}$$

One can see that, apart from the prefactor, the PDF is a function of $x$ only. Roughly speaking, in this 
integral, $y \sim 1/x$ so that the validity domain of this expression is limited to 
cases where $\varphi(1/x) \ll \bar{\xi}$. It is thus limited to the regime 

$$x \gg \left(\bar{\xi}/a\right)^{-1/(1-\omega)}, \tag{E.19}$$

if $\varphi(y)$ behaves like $a\,y^{1-\omega}$ at large $y$. 

If $x$ is small, in a regime where $\varphi(y)$ can be approximated by its power-law 
asymptotic shape, the PDF eventually reads 

$$P(\rho) = \frac{1}{\bar{\xi}^2}\, \frac{a\,(1-\omega)}{\Gamma(\omega)}\, x^{\omega - 2}. \tag{E.20}$$

For large values of $x$, one recovers the exponential cut-off found in the previous 
regime, (E.17), with further simplifications since $\bar{\xi} \gg 1$, 

$$P(\rho) = \frac{a_s}{\Gamma(-\omega_s)\,\bar{\xi}^2}\, x^{-\omega_s - 1} \exp\left(-|y_s|\,x\right). \tag{E.21}$$


The second scaling regime corresponds to the underdense regions. They are 
described by the asymptotic form of $\varphi(y)$, which implies 

$$P(\rho) = \frac{1}{\bar{\xi}} \int_{-i\infty}^{+i\infty} \frac{\mathrm{d}y}{2\pi i} \exp\left[-\frac{a\,y^{1-\omega}}{\bar{\xi}} + \frac{\rho\,y}{\bar{\xi}}\right]. \tag{E.22}$$

A simple change of variable, $t^{1-\omega} = a\,y^{1-\omega}/\bar{\xi}$, shows that it can be written 

$$P(\rho) = \frac{1}{\bar{\xi}} \left(\frac{a}{\bar{\xi}}\right)^{-1/(1-\omega)} \int_{-i\infty}^{+i\infty} \frac{\mathrm{d}t}{2\pi i} \exp\left(-t^{1-\omega} + z\,t\right) \tag{E.23}$$

with 

$$z = \frac{\rho}{\bar{\xi}} \left(\frac{\bar{\xi}}{a}\right)^{1/(1-\omega)}, \tag{E.24}$$

which can be written 

$$P(\rho) = \frac{z}{\pi\rho} \int_0^{\infty} \mathrm{d}u\, \sin\left(u^{1-\omega} \sin\pi\omega\right)\, e^{-zu + u^{1-\omega}\cos\pi\omega}. \tag{E.25}$$

For large values of $z$, the power law behavior of (E.19) is recovered, and the 
two regimes overlap. 

Small values of $z$, however, describe the small density cut-off. The expression 
of the PDF can be obtained by a saddle point approximation, and it appears 
to be a particular case of the results obtained in Eq. (E.13). Note that the 
shape of this cut-off depends only on $\omega$, 

$$P(\rho) = \frac{1}{\rho} \sqrt{\frac{(1-\omega)^{1/\omega}}{2\pi\omega}}\; z^{-\frac{1-\omega}{2\omega}} \exp\left[-\omega\,(1-\omega)^{\frac{1-\omega}{\omega}}\, z^{-\frac{1-\omega}{\omega}}\right]. \tag{E.26}$$
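The link between the real integral of (E.25) and the saddle point form of the cut-off can be checked numerically in a special case. The sketch below assumes $\omega = 1/2$, chosen only because the contour integral of $\exp(-t^{1-\omega} + zt)$ is then the standard inverse Laplace transform of $e^{-\sqrt{t}}$, equal to $z^{-3/2} e^{-1/(4z)}/(2\sqrt{\pi})$, and the Gaussian saddle point estimate happens to be exact. Only the $z$-dependent integral is evaluated, without the prefactor of $P(\rho)$; the function names are hypothetical.

```python
import math

def g_numeric(z, v_max=12.0, steps=120000):
    """(1/pi) * integral over u from 0 to infinity of sin(sqrt(u)) * exp(-z u),
    i.e. the integral of (E.25) for omega = 1/2 (cos(pi/2) = 0, sin(pi/2) = 1),
    evaluated with the substitution u = v**2 and the trapezoidal rule."""
    h = v_max / steps
    total = 0.0
    for i in range(1, steps):            # the integrand vanishes at v = 0
        v = i * h
        total += 2.0 * v * math.sin(v) * math.exp(-z * v * v)
    return total * h / math.pi

def g_saddle(z, omega):
    """Saddle-point estimate of the contour integral of exp(-t**(1 - omega) + z t)."""
    t_s = ((1.0 - omega) / z) ** (1.0 / omega)             # stationary point
    second = omega * (1.0 - omega) * t_s ** (-omega - 1.0)  # second derivative there
    return math.exp(-omega * t_s ** (1.0 - omega)) / math.sqrt(2.0 * math.pi * second)

for z in (0.5, 1.0, 2.0):
    print(z, g_numeric(z), g_saddle(z, 0.5))
```

For other values of $\omega$ the two functions agree only at small $z$, which is the regime in which the saddle point approximation is meant to be used.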

E.5 Numerical Computation of the Laplace Inverse Transform 


The starting point of the numerical computation of the local density PDF 
from the cumulant generating function is equation (E.8). In case the cumulant 
generating function can be obtained from a vertex generating function $\mathcal{G}$, the 
latter is the natural variable to use. The technical difficulty is actually to 
choose the path to follow in the $y$ or $\mathcal{G}$ complex plane. The original path for $y$ 
runs from $-i\infty$ to $+i\infty$ along the imaginary axis. But as the functions $\tau(y)$ or 
$\varphi(y)$ are not analytic over the complex plane (there is at least one singularity 
on the real axis for $y = y_s < 0$) the crossing point of the path with the real 
axis cannot be moved to the left side of $y_s$ (otherwise the PDF would simply 
vanish!). Actually the crossing point of the path for the numerical integration 
is the position of the saddle point, $y_{\rm saddle}$, defined by 

$$0 = \rho - 1 - \left.\frac{\mathrm{d}\varphi(y)}{\mathrm{d}y}\right|_{y = y_{\rm saddle}}. \tag{E.27}$$


This equation has a solution as long as $\rho < 1 + \delta_c$ and it is then at a point 
$y_{\rm saddle} > y_s$ (see Section 5.8). In case $\rho > 1 + \delta_c$ the crossing point of the 
integration path is simply chosen to be $y = y_s$. The integration path is 
subsequently built in such a way that $\rho y - y - \varphi(y)$ is kept real and negative, to 
avoid unnecessary oscillations of the function to integrate. In practice the path 
is built step by step with an adaptive integration scheme [44,151]. 
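A much simplified version of this numerical inversion can be sketched as follows. Instead of the adaptive steepest-descent path described above, the code integrates (E.8) along a straight vertical line $y = y_{\rm cross} + it$, which is adequate only for smooth test cases. It assumes that the exponent of (E.8) is $[(\rho-1)\,y - \varphi(y)]/\bar{\xi}$, and uses the Gaussian case $\varphi(y) = -y^2/2$ as an illustration; the function name and the integration settings are hypothetical.

```python
import cmath
import math

def pdf_by_contour(rho, xi, phi, y_cross, t_max=5.0, steps=5000):
    """Evaluate (E.8) on the vertical line y = y_cross + i t (trapezoidal rule),
    assuming the exponent is [(rho - 1) * y - phi(y)] / xi.
    t_max should be several times sqrt(xi) for the Gaussian test case."""
    h = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * h
        y = complex(y_cross, t)
        w = 0.5 if i in (0, steps) else 1.0
        # dy = i dt cancels the i of 2 pi i; imaginary parts cancel by symmetry
        total += w * cmath.exp(((rho - 1.0) * y - phi(y)) / xi).real
    return total * h / (2.0 * math.pi * xi)

gauss_phi = lambda y: -0.5 * y * y
rho, xi = 1.3, 0.04
exact = math.exp(-(rho - 1.0)**2 / (2.0 * xi)) / math.sqrt(2.0 * math.pi * xi)
through_saddle = pdf_by_contour(rho, xi, gauss_phi, y_cross=1.0 - rho)  # saddle of the Gaussian case
through_origin = pdf_by_contour(rho, xi, gauss_phi, y_cross=0.0)        # original imaginary axis
print(exact, through_saddle, through_origin)
```

Both crossing points give the same answer here because the Gaussian $\varphi$ has no singularity; the integrand oscillates on the imaginary axis but not on the line through the saddle point, which is the motivation for the path choice described in the text.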

F Cosmic Errors: Expressions for the Factorial Moments 

In this appendix, we first explain how the cosmic error on the factorial 
moments of counts-in-cells is calculated. We then list the corresponding analytic 
expressions for the cosmic covariance matrix up to third order in the 
three-dimensional case. 

F.1 Method 

From now on, to simplify, we assume that the cells are spherical (or circular, in two 
dimensions), but the results remain valid in practice, with the obvious appropriate 
corrections, for any compact cell. 

The local Poisson assumption allows us to neglect correlations inside the union 
$C_\cup$ of volume $v_\cup$ of two overlapping cells and the non-spherical contribution 
of $C_\cup$. As a result, the generating function for bicounts in overlapping cells 
reads [621] 

$$\mathcal{P}_{\rm over}(x, y) = \mathcal{P}_\cup\left[q\,(x + y) + p\,xy\right]. \tag{F.1}$$

The generating function $\mathcal{P}_\cup(x)$ is the same as $\mathcal{P}(x)$ but for a cell of volume 
$v_\cup$, and 

$$p = \frac{1 - f_D(r/R)}{1 + f_D(r/R)}, \qquad q = \frac{f_D(r/R)}{1 + f_D(r/R)}, \tag{F.2}$$

where $f_D(r/R)$ represents the excess of volume (or area) of $v_\cup$ compared to $v_R$, 

$$v_\cup = v_R\left[1 + f_D(r/R)\right], \tag{F.3}$$

and $r$ is the separation between the two cells. We have $f_3(\psi) = (3/4)\,\psi - (1/16)\,\psi^3$ 
and $f_2(\psi) = 1 - (1/\pi)\left[2\arccos(\psi/2) - \psi\sqrt{1 - \psi^2/4}\right]$ in three 
and two dimensions, respectively. 
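The geometric factors entering (F.2) and (F.3) are easy to evaluate. The short Python sketch below implements $f_3$, $f_2$ and the weights $p$ and $q$, with $\psi = r/R$ between 0 and 2 for overlapping cells; the function names are hypothetical. Since $q$ is the fraction of the union lying in one cell only and $p$ the fraction in the overlap, $p + 2q = 1$ for any separation, which gives a simple consistency check.

```python
import math

def f3(psi):
    """Excess volume of the union of two spheres of radius R at separation psi * R."""
    return 0.75 * psi - psi**3 / 16.0

def f2(psi):
    """Excess area of the union of two discs of radius R at separation psi * R."""
    lens = 2.0 * math.acos(psi / 2.0) - psi * math.sqrt(1.0 - psi * psi / 4.0)
    return 1.0 - lens / math.pi

def overlap_weights(f):
    """Weights p and q of (F.2) for a given excess volume f."""
    return (1.0 - f) / (1.0 + f), f / (1.0 + f)

for psi in (0.0, 0.5, 1.0, 2.0):
    p, q = overlap_weights(f3(psi))
    print(psi, f3(psi), f2(psi), p, q, p + 2.0 * q)
```

At $\psi = 0$ the cells coincide ($f_D = 0$, $p = 1$, $q = 0$) and at $\psi = 2$ they just touch ($f_D = 1$, $p = 0$, $q = 1/2$).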

The generating function for disjoint cells is Taylor expanded 

disjoint (x,y) ~ V(x)V(y) [1 + K(x,y)] + 0(^), (F.4) 

with 

OO Q 

= Z E (F.5) 

M=l,N=l . 1V1. 
It is then easy to calculate the cross-correlations on factorial moments, $\Delta_{k,l}$, by 
computing the double integral in Eq. (449) after applying the partial derivatives in 
Eq. (447), with the further assumption that the two-point correlation function 
is well approximated by a power law of index $-\gamma \simeq -1.8$ for $r < 2R$ (see footnote 150). 


F.2 Analytic Results 


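The cosmic errors for the factorial moments as discussed in Sect. 6.7.4, Eq. (451), 
are now detailed here, up to third order (in the three-dimensional case). The 
superscripts F, E and D denote the finite volume, edge and discreteness 
contributions, respectively: 

$$\Delta^{\rm F}_{11} = \bar{N}^2\,\bar{\xi}(\hat{L}), \tag{F.6}$$

$$\Delta^{\rm E}_{11} = 5.508\,\bar{N}^2\,\bar{\xi}\,\frac{v}{V}, \tag{F.7}$$

$$\Delta^{\rm D}_{11} = \bar{N}\,\frac{v}{V}, \tag{F.8}$$

$$\Delta^{\rm F}_{22} = 4\bar{N}^4\,\bar{\xi}(\hat{L}) \left(1 + 2\,\bar{\xi}\,Q_{12} + \bar{\xi}^2 Q_{22}\right), \tag{F.9}$$

$$\Delta^{\rm E}_{22} = \bar{N}^4\,\bar{\xi}\,\frac{v}{V} \left(17.05 + 3.417\,\bar{\xi} + 45.67\,\bar{\xi}\,Q_3 + 42.24\,\bar{\xi}^2 Q_4\right), \tag{F.10}$$

$$\Delta^{\rm D}_{22} = \bar{N}^2\,\frac{v}{V} \left(0.648 + 4\bar{N} + 0.502\,\bar{\xi} + 8.871\,\bar{N}\bar{\xi} + 6.598\,\bar{N}^2\bar{\xi}^2 Q_3\right), \tag{F.11}$$

$$\Delta^{\rm F}_{33} = 9\bar{N}^6\,\bar{\xi}(\hat{L}) \left(1 + 2\,\bar{\xi} + \bar{\xi}^2 + 4\,\bar{\xi}\,Q_{12} + 4\,\bar{\xi}^2 Q_{12} + 6\,\bar{\xi}^2 Q_{13} + 6\,\bar{\xi}^3 Q_{13} + 4\,\bar{\xi}^2 Q_{22} + 12\,\bar{\xi}^3 Q_{23} + 9\,\bar{\xi}^4 Q_{33}\right), \tag{F.12}$$

$$\Delta^{\rm E}_{33} = \bar{N}^6\,\bar{\xi}\,\frac{v}{V} \left(34.62 + 99.26\,\bar{\xi} + 39.60\,\bar{\xi}^2 + 180.3\,\bar{\xi}\,Q_3 + 331.1\,\bar{\xi}^2 Q_3 + 93.50\,\bar{\xi}^3 Q_3^2 + 633.5\,\bar{\xi}^2 Q_4 + 441.3\,\bar{\xi}^3 Q_4 + 1379\,\bar{\xi}^3 Q_5 + 1668\,\bar{\xi}^4 Q_6\right), \tag{F.13}$$

$$\Delta^{\rm D}_{33} = \bar{N}^3\,\frac{v}{V} \left(0.879 + 5.829\,\bar{N} + 9\,\bar{N}^2 + 2.116\,\bar{\xi} + 27.13\,\bar{N}\bar{\xi} + 66.53\,\bar{N}^2\bar{\xi} + 10.59\,\bar{N}\bar{\xi}^2 + 74.23\,\bar{N}^2\bar{\xi}^2 + 1.709\,\bar{\xi}^2 Q_3 + 42.37\,\bar{N}\bar{\xi}^2 Q_3 + 148.5\,\bar{N}^2\bar{\xi}^2 Q_3 + 111.2\,\bar{N}^2\bar{\xi}^3 Q_3 + 44.40\,\bar{N}\bar{\xi}^3 Q_4 + 296.4\,\bar{N}^2\bar{\xi}^3 Q_4 + 349.3\,\bar{N}^2\bar{\xi}^4 Q_5\right). \tag{F.14}$$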


The cosmic cross-correlations read 


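$$\Delta^{\rm F}_{12} = 2\bar{N}^3\,\bar{\xi}(\hat{L}) \left(1 + \bar{\xi}\,Q_{12}\right), \tag{F.15}$$

$$\Delta^{\rm E}_{12} = \bar{N}^3\,\bar{\xi}\,\frac{v}{V} \left(8.525 + 11.42\,\bar{\xi}\,Q_3\right), \tag{F.16}$$

$$\Delta^{\rm D}_{12} = \bar{N}^2\,\frac{v}{V} \left(2.0 + 1.478\,\bar{\xi}\right), \tag{F.17}$$

$$\Delta^{\rm F}_{13} = 3\bar{N}^4\,\bar{\xi}(\hat{L}) \left(1 + \bar{\xi} + 2\,\bar{\xi}\,Q_{12} + 3\,\bar{\xi}^2 Q_{13}\right), \tag{F.18}$$

$$\Delta^{\rm E}_{13} = \bar{N}^4\,\bar{\xi}\,\frac{v}{V} \left(9.05 + 11.42\,\bar{\xi} + 21.67\,\bar{\xi}\,Q_3 + 42.24\,\bar{\xi}^2 Q_4\right), \tag{F.19}$$

$$\Delta^{\rm D}_{13} = \bar{N}^3\,\frac{v}{V} \left(3.0 + 6.653\,\bar{\xi} + 4.949\,\bar{\xi}^2 Q_3\right), \tag{F.20}$$

$$\Delta^{\rm F}_{23} = 6\bar{N}^5\,\bar{\xi}(\hat{L}) \left(1 + \bar{\xi} + 3\,\bar{\xi}\,Q_{12} + 3\,\bar{\xi}^2 Q_{13} + \bar{\xi}^2 Q_{12} + 2\,\bar{\xi}^2 Q_{22} + 3\,\bar{\xi}^3 Q_{23}\right), \tag{F.21}$$

$$\Delta^{\rm E}_{23} = \bar{N}^5\,\bar{\xi}\,\frac{v}{V} \left(23.08 + 33.09\,\bar{\xi} + 90.17\,\bar{\xi}\,Q_3 + 55.19\,\bar{\xi}^2 Q_3 + 211.2\,\bar{\xi}^2 Q_4 + 229.9\,\bar{\xi}^3 Q_5\right), \tag{F.22}$$

$$\Delta^{\rm D}_{23} = \bar{N}^3\,\frac{v}{V} \left(1.943 + 6\,\bar{N} + 4.522\,\bar{\xi} + 26.61\,\bar{N}\bar{\xi} + 9.898\,\bar{N}\bar{\xi}^2 + 3.531\,\bar{\xi}^2 Q_3 + 39.59\,\bar{N}\bar{\xi}^2 Q_3 + 39.53\,\bar{N}\bar{\xi}^3 Q_4\right). \tag{F.23}$$


150 The results do not depend significantly on the value of $\gamma$ [621]. 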


Note that the finite volume effect terms $\Delta^{\rm F}_{kl}$ would be the same in the 2-D 
case. In the above equations, $\bar{\xi}(\hat{L})$ is the integral of the two-point correlation 
function over the survey volume [Eq. (452)] and 

$$Q_N \equiv \frac{S_N}{N^{N-2}}, \qquad Q_{NM} \equiv \frac{C_{NM}}{N^{N-1} M^{M-1}}. \tag{F.24}$$

Note that these $Q_N$ and $Q_{NM}$ are slightly different from what was defined in 
Eqs. (150) and (214). They are also often used in the literature instead of $S_p$ 
or $C_{pq}$. 

An accurate approximation for $\bar{\xi}(\hat{L})$ is [153,154] 

$$\bar{\xi}(\hat{L}) \simeq \frac{1}{\hat{V}^2} \int_{\hat{V}} \mathrm{d}^D r_1\, \mathrm{d}^D r_2\, \xi(r_{12}) - \frac{1}{\hat{V}} \int_{r < 2R} \mathrm{d}^D r\, \xi(r). \tag{F.25}$$

This means that, rigorously, the finite volume error as we defined it 
here contains an edge effect term. For practical calculations, however, 
the following approximation generally works quite well: 

$$\bar{\xi}(\hat{L}) \simeq \bar{\xi}(L), \tag{F.26}$$

where $\bar{\xi}(L)$ was defined in Eq. (389). 